The Uncertainty Principle and Foundations of Quantum Mechanics
A Fifty Years' Survey

Edited by
William C. Price, F.R.S., Wheatstone Professor of Physics, University of London King's College
Seymour S. Chissick, Lecturer in Chemistry, University of London King's College

A Wiley-Interscience Publication
JOHN WILEY & SONS
Chichester • New York • Brisbane • Toronto

Copyright © 1977, by John Wiley & Sons, Ltd. Reprinted February 1979. All rights reserved. No part of this book may be reproduced by any means, nor transmitted, nor translated into a machine language without the written consent of the publisher.

Library of Congress Cataloging in Publication Data: Main entry under title: The Uncertainty principle and foundations of quantum mechanics. 'A Wiley-Interscience publication.' 'A tribute to Professor Werner Heisenberg to commemorate the fiftieth anniversary of the formulation of quantum mechanics.' 1. Quantum theory -- Addresses, essays, lectures. 2. Heisenberg uncertainty principle -- Addresses, essays, lectures. I. Price, William Charles, 1909- . II. Chissick, Seymour S. III. Heisenberg, Werner, 1901-1976. QC174.125.U5 530.17 / 76-18213 ISBN 471 99414 6

Set on Linotron Filmsetter and Printed in Great Britain by J. W. Arrowsmith Ltd., Bristol

A tribute to Professor Werner Heisenberg to commemorate the fiftieth anniversary of the formulation of quantum mechanics.

List of Contributors

Bohm, David
Clarke, Christopher J. S.
Detrich, John H.
Feldman, Gordon
Gudder, Stanley P.
Heisenberg, Werner
Hodgson, Peter E.
Kraus, K.
Kuryshkin, Vassili V.
Lanz, Ludovico
Ludwig, Guenther
Mignani, Roberto
Papp, Erhardt W. R.
Ratner, Mark A.
Rayski, Jacek M.
Rayski, Jerzy
Recami, Erasmo
Reece, Gordon
Roothaan, Clemens C. J.
Rühl, W.
Rylov, Yuri A.
Sabin, John R.
Santhanam, Thalanayar S.
Slawianowski, Jan J.
Stenger, William
Streit, Ludwig
Tassie, Lindsay J.
Trickey, Samuel B.
Van Horn, Hugh M.
Yunn, B. C.

Dedication

PROFESSOR SIR HERMANN BONDI, K.C.B.

A remarkable factor in the progress of science is the temporary concentration of interest on particular topics. Science is above all a social activity, the picture of the lonely scientist being largely a figment of an untutored imagination. Scientists hunt in packs and follow scents. Sometimes the scent is material: when the progress of technology has opened up a whole new method of experimental work, and the ways of using this newly available technique, the assimilation of the results obtained, the formulation of novel hypotheses and their means of being tested all attract a large pack of experimenters and theorists that in full cry produces astonishingly rapidly a large and novel output. On other occasions the scent is intellectual, when an awkward question has been asked and many try to find at least partial answers to it, answers that can often lead to fruitful insights and new and vital problems. One characteristic of this pack hunting is that if a new theory leads in rapid succession to numerous and varied experimental tests, each passed with honours, each leading to yet newer applications, then hardly anybody will stop to examine critically and logically the philosophy and internal consistency of the theory. Nobody will want to do so, because if he finds no flaw, his work will be regarded as insignificant, while if he does find a flaw, his papers will be brushed aside with the comment: 'The theory works, so there must be some fault in his argument. Why waste time to sort it out when there are so many more fascinating things to be done?' Thus foundations cannot be coolly examined until well after the main part of the pack has passed the site of the excavations, many years later.
This volume brings together many illuminating phases of one of the most exciting and successful hunts in history, the formulation of quantum theory. Not only was this hunt outstanding in the range and wealth of experimental data it covered (including the extension of the applicability of a theory founded on the spectroscopy of atoms to nuclear physics), but also in its philosophical implications. Some of them appeared early, some were later grossly misunderstood and indeed exaggerated, but many are only now starting to be fully explored and beginning to come into focus because only now is the site of the excavations sufficiently unencumbered to allow for deep investigations into problems of foundations.

Nothing could be more appropriate for such a hunt than to start with Heisenberg's own description of the origin of his celebrated uncertainty relations. Now that he is, alas, no longer with us, particular value attaches to this recent recollection of the most formative phase of modern physics by one of its foremost figures.

It is splendid to observe from the many contributions in the first two parts of the volume how lively and active the subjects of foundations and of measurement theory now are in so many parts of the world. The final two parts deal with novel aspects of formal theory and of applications, where again we live in a vigorous period of activity.

I trust that many will find this book thought-provoking, enjoyable and indeed fascinating.

Foreword

HANS MATTHÖFER
Bundesminister für Forschung und Technologie, F.D.R.

Great achievements on the part of researchers are often the result of their having had the courage to leave familiar ground and to explore genuinely unknown fields. The discoverer of the quantum theory and the uncertainty principle was required to leave the solid ground of classical physics.
One of the most significant changes in our comprehension of the universe, a change which is reflected in fields far removed from physics, was wrought by the departure from the determinacy of physical phenomena and by a far deeper-reaching relativization of the law of causality. The quantum theory and the uncertainty principle are discoveries which have changed the basis of our way of thinking. We still cannot foresee their ultimate consequences.

Werner Heisenberg, whose passing we mourned during the preparation of this book, displayed not only the courage to leave the familiar terrain of classical physics. He also possessed the spirit to defend that which has been established as true in his field of science against nationalism and racism, even in the face of the most bitter political oppression. Both during and after the Second World War, he was therefore a guarantor of another Germany which desired peace and reconciliation among the peoples of the world.

Preface

The first thirty years of the twentieth century saw an explosive development in the physical sciences, the like of which it is improbable we shall see again. Many of the discoveries were made by comparatively young men and this has provided opportunities for the international scientific community to commemorate the fiftieth anniversary of some of the more fundamental discoveries during the lifetimes of their discoverers. This book, dedicated to Professor Werner Heisenberg, is one in a series of books, each designed as a tribute to one of the founders of modern physics. While the book was organized with the cooperation of Professor Heisenberg, it is with deep regret that we learned of his death on 15th February 1976, at the age of 74, just before going to press. This book commemorates the formulation by Heisenberg in the Spring of 1925 of the system of mechanics known as quantum (or matrix) mechanics.
The subsequent development of quantum mechanics by Heisenberg with Max Born and Pascual Jordan provided the basis for modern physics. One of Heisenberg's best known and far-reaching contributions to the understanding of quantum mechanics was his Uncertainty Principle, which limits the precision of measurement of the dynamic variables of a system. While Heisenberg's decisive contribution to physics, for which he received the Nobel Prize in 1932, was made at the age of 24, he continued to advance knowledge over a wide range of subjects: nuclear and sub-nuclear physics, S-matrix theory, solid state theory, plasma and thermonuclear physics, unified field theory, etc.

In compiling this volume, the editors have again been fortunate in securing the help and cooperation of scientists throughout the world. The aims were essentially similar to those of Wave Mechanics, the First Fifty Years (a tribute to Professor Louis de Broglie on the fiftieth anniversary of the discovery of the wave nature of the electron): to review aspects of the philosophical implications, past and current thinking and potential future developments in physics stemming from the fundamental discoveries associated with, in this case, Werner Heisenberg.

The Editors wish to record their thanks to the University of London King's College, for the facilities provided and to Professor David Bohm, Dr. R. J. Griffiths and Dr. M. P. Melrose for reading various sections of the manuscript and for making helpful comments.

February 1976
William C. Price, F.R.S.
Seymour S. Chissick
University of London King's College

Contents

PART 1 QUANTUM UNCERTAINTY DESCRIPTION
1 Remarks on the Origin of the Relations of Uncertainty
   Werner Heisenberg
2 In Praise of Uncertainty
   Gordon Reece
3 On the Meaning of the Time-Energy Uncertainty Relation
   Jerzy Rayski and Jacek M. Rayski, Jr.
4 A Time Operator and the Time-Energy Uncertainty Relation
   Erasmo Recami
5 Quantum Theory of the Natural Space-Time Units
   Erhardt W. R. Papp
6 Uncertain Cosmology
   Christopher J. S. Clarke
7 Uncertainty Principle and the Problems of Joint Coordinate-Momentum Probability Density in Quantum Mechanics
   Vassili V. Kuryshkin

PART 2 MEASUREMENT THEORY
8 The Problem of Measurement in Quantum Mechanics
   Ludovico Lanz
9 The Correspondence Principle and Measurability of Physical Quantities in Quantum Mechanics
   Yuri A. Rylov
10 Uncertainty, Correspondence and Quasiclassical Compatibility
   Jan J. Slawianowski
11 A Theoretical Description of Single Microsystems
   Guenther Ludwig
12 Quantum Mechanics of Bounded Operators
   Thalanayar S. Santhanam

PART 3 FORMAL QUANTUM THEORY
13 Four Approaches to Axiomatic Quantum Mechanics
   Stanley P. Gudder
14 Intermediate Problems for Eigenvalues in Quantum Theory
   William Stenger
15 Position Observables of the Photon
   K. Kraus
16 A New Approach and Experimental Outlook on Magnetic Monopoles
   Erasmo Recami and Roberto Mignani
17 Problems in Conformally Covariant Quantum Field Theory
   W. Rühl and B. C. Yunn
18 The Construction of Quantum Field Theories
   Ludwig Streit
19 Classical Electromagnetic and Gravitational Field Theories as Limits of Massive Quantum Theories
   Gordon Feldman
20 Relativistic Electromagnetic Interaction Without Quantum Electrodynamics
   Clemens C. J. Roothaan and John H. Detrich

PART 4 APPLIED QUANTUM MECHANICS
21 The Uncertainty Principle and the Structure of White Dwarfs
   Hugh M. Van Horn
22 Applications of Model Hamiltonians to the Electron Dynamics of Organic Charge Transfer Salts
   Mark A. Ratner, John R. Sabin and Samuel B. Trickey
23 Alpha-Clustering in Nuclei
   Peter E. Hodgson
24 Commutation Relations, Hydrodynamics and Inelastic Scattering by Atomic Nuclei
   Lindsay J. Tassie
25 Heisenberg's Contribution to Physics
   David Bohm

Author Index
Subject Index

PART 1
Quantum Uncertainty Description

1 Remarks on the Origin of the Relations of Uncertainty

The late Professor WERNER HEISENBERG
Director Emeritus of the Max Planck Institut für Physik und Astrophysik, Munich, Germany

The situation of quantum theory in the summer of 1926 can be characterized by two statements. The mathematical equivalence of matrix mechanics and wave mechanics had been demonstrated by Schrödinger, and the consistency of the mathematical scheme could scarcely be doubted; but the physical interpretation of this formalism was still quite controversial. Schrödinger, following the original ideas of de Broglie, tried to compare the 'matter waves' with electromagnetic waves, to consider them as real, measurable waves in three-dimensional space. Therefore he preferred to discuss those cases where the configuration space had only three dimensions (one-particle systems), and he hoped that the 'irrational' features of quantum theory, especially quantum 'jumps', could be completely avoided in wave mechanics. The stationary states of a system were defined as standing waves; their energy was really the frequency of the waves. Born, on the other hand, had used the configuration space of Schrödinger's theory to describe collision processes and he took the square of the wave amplitude in configuration space as the probability of finding a particle. So he emphasized the statistical character of quantum theory without attempting to describe what 'really happens' in space and time. Schrödinger's attempt appealed to many physicists who were not willing to accept the paradoxes of quantum theory; but the discussions with him in July 1926 in Munich and in September in Copenhagen demonstrated very soon that such a 'continuous' interpretation of wave mechanics could not even explain Planck's law of heat radiation.
Since Schrödinger was not quite convinced, it seemed to me extremely important to decide beyond any doubt whether or not quantum 'jumps' were an unavoidable consequence if one accepted that part of the interpretation of matrix mechanics which already at that time was not controversial, namely the assumption that the diagonal element of a matrix represents the time average of the corresponding physical variable in the stationary state considered. Therefore I discussed a system consisting of two atoms in resonance. The energy difference between two specified consecutive stationary states was assumed to be equal in the two atoms so that for the same total energy the first atom could be in the upper and the second in the lower state or vice versa. If the interaction between the two atoms is very small one should expect that the energy goes slowly back and forth between the two atoms. In this case it can easily be decided whether the energy of one of the atoms goes continuously from the upper to the lower state and back again or discontinuously by means of sudden quantum jumps. If E is the energy of this one atom then the mean square of fluctuations $\overline{\Delta E^2}$ is quite different in the two cases [equation (1)].

$$\overline{\Delta E^2} = \overline{(E - \overline{E})^2} = \overline{E^2} - \overline{E}^2 \tag{1}$$

The calculation does not require more than the non-controversial assumption of matrix mechanics mentioned above. The result decided clearly in favour of the quantum jumps and against the continuous change.

The success of this calculation seemed to indicate that the non-controversial part of the interpretation of quantum mechanics should already determine uniquely the complete interpretation of the mathematical scheme, and I was convinced that there was no room left for any new assumptions in the interpretation.
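The contrast drawn by equation (1) can be illustrated with a small numerical sketch. It is not the original calculation: the two-state basis, the symmetric stationary state with equal weights, and the sinusoidal form of the 'continuous' exchange are all assumptions made here for illustration, and the function names are hypothetical.

```python
import math


def quantum_mean_square(e_upper, e_lower):
    """Mean-square energy fluctuation of one atom from diagonal matrix elements.

    Assumed basis: |upper, lower> and |lower, upper>. The energy of atom 1 is
    diagonal in this basis with entries e_upper and e_lower. The stationary
    state of the weakly coupled pair is assumed to be the symmetric
    combination, with amplitude 1/sqrt(2) on each basis state.
    """
    amplitude = 1.0 / math.sqrt(2.0)
    weights = [amplitude ** 2, amplitude ** 2]  # squared amplitudes
    energies = [e_upper, e_lower]
    mean_e = sum(w * e for w, e in zip(weights, energies))       # diagonal element of E
    mean_e2 = sum(w * e * e for w, e in zip(weights, energies))  # diagonal element of E^2
    return mean_e2 - mean_e ** 2


def continuous_mean_square(e_upper, e_lower, samples=10_000):
    """Time average of (E - mean)^2 for a smooth sinusoidal exchange (assumed form)."""
    half_gap = 0.5 * (e_upper - e_lower)
    total = 0.0
    for k in range(samples):
        phase = 2.0 * math.pi * k / samples  # one full exchange period
        total += (half_gap * math.cos(phase)) ** 2
    return total / samples


def jump_mean_square(e_upper, e_lower):
    """Time average of (E - mean)^2 if the atom spends half the time in each level."""
    mid = 0.5 * (e_upper + e_lower)
    return 0.5 * (e_upper - mid) ** 2 + 0.5 * (e_lower - mid) ** 2


if __name__ == "__main__":
    upper, lower = 1.0, 0.0  # arbitrary illustrative level energies
    print("matrix elements:", quantum_mean_square(upper, lower))
    print("sudden jumps:   ", jump_mean_square(upper, lower))
    print("smooth exchange:", continuous_mean_square(upper, lower))
```

Under these assumptions the value obtained from the diagonal matrix elements coincides with the jump picture, one quarter of the squared level spacing, whereas a smooth sinusoidal exchange would give only one eighth.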
In fact, in the example mentioned above the square of the elements of that matrix, which transformed from the state where the total energy of the system was diagonal to the state where the energy of the one atom was diagonal, had to be considered as the corresponding probability. In the autumn of 1926 Dirac and Jordan formulated the theory of those general linear transformations which corresponded to the canonical transformations of classical mechanics and which nowadays are called the unitary transformations in Hilbert space. These authors correctly interpreted the square of the elements of the transformation matrix as the corresponding probability; this was in line with Born's older assumptions concerning the square of Schrödinger's wave function in configuration space and with the example of the resonating atoms. It was in fact the only assumption which was compatible with the old non-controversial part of the interpretation of quantum mechanics; so it seemed that the correct interpretation of the mathematical theory had finally been given.

But was it really an interpretation, was the mathematical scheme a theory of the phenomena? In physics we observe phenomena in space and time; the theory should enable us, starting from the present observation, to predict the further development of the phenomenon concerned. But at this point the real difficulties started. We observe phenomena in space and time, not in configuration space or in Hilbert space. How can we translate the result of an observation into the mathematical scheme? E.g. we observe an electron in a cloud chamber moving in a certain direction with a certain velocity; how should this fact be expressed in the mathematical language of quantum mechanics? The answer to this question was not known at the end of 1926. For some time Schrödinger had discussed the possibility that a wave packet obeying his wave equation could represent an electron.
But as a rule a wave packet spreads out so that after some time it may be extended over a volume much bigger than that of the electron. In nature, however, an electron remains an electron; so this interpretation would not do. Schrödinger pointed out that in one special case, the harmonic oscillator, the wave packet did not spread; but this property had to do with the special fact that for the harmonic oscillator the frequency does not depend on the amplitude. On the other hand there could be no doubt that de Broglie's and Schrödinger's picture of the three-dimensional matter waves did contain some truth. In the many discussions we had in Copenhagen during the months after Schrödinger's visit it was primarily Bohr who emphasized this point again and again. But what does this term 'some truth' mean? We had already too many statements which contained 'some truth'. We could, for example, compare the statements: 'The electron moves in an orbit around the nucleus.' 'The electron moves on a visible path through the cloud chamber.' 'The electron source emits a matter wave which can produce interferences in crystals like a light wave.' Each of these statements seemed to be partly true and partly not true, and certainly they did not fit together. We got the definite impression that the language we used for the description of the phenomena was not quite adequate. At the same time we saw that at least in some experiments such concepts as position or velocity of the electron, wavelength and energy had a precise meaning; their counterpart in nature could be measured very accurately. It turned out that for a well defined experimental situation we finally always arrived at the same prediction, though Bohr preferred to play between the particle- and wave-picture while I tried to use the mathematical scheme and its probabilistic interpretation.
Still we were not able to get complete clarity; but we understood that the 'well defined experimental situation' somehow played an important role in the prediction.

In the beginning of 1927 I was for some weeks alone in Copenhagen; Bohr had gone to Norway for a skiing holiday. In this time I concentrated all my efforts on the question: How can the path of an electron in a cloud chamber be represented in the mathematical scheme of quantum mechanics? In the despair about the futility of my attempts I remembered a discussion with Einstein and his remark: 'it is the theory which decides what can be observed'. Therefore I tried to turn around the question. Is it perhaps true that only such situations occur in nature or in the experiments which can be represented in the mathematical scheme of quantum mechanics? That meant: there was not a real path of the electron in the cloud chamber. There was a sequence of water droplets. Each droplet determined inaccurately the position of the electron, and the velocity could be deduced inaccurately from the sequence of droplets. Such a situation could actually be represented in the mathematical scheme; the calculation gave a lower limit for the product of the inaccuracies of position and momentum.

It remained to be demonstrated that the result of any well defined observation would obey this relation of uncertainty. Many experiments were discussed, and Bohr again used successfully the two pictures, wave- and particle-picture, in the analysis. The results confirmed the validity of the relations of uncertainty; but in some way this outcome could be considered as trivial. Because if the process of observation itself is subject to the laws of quantum theory, it must be possible to represent its result in the mathematical scheme of this theory.
But these discussions demonstrated at least that the way in which quantum theory was used in the analysis of the observations was completely compatible with the mathematical scheme.

The main point in this new interpretation of quantum theory was the limitation in the applicability of the classical concepts. This limitation is in fact general and well defined; it applies to concepts of the particle picture, like position, velocity, energy, as well as to concepts of the wave picture like amplitude, wavelength, density. In this connection it was very satisfactory that somewhat later Jordan, Klein and Wigner were able to show that Schrödinger's three-dimensional wave picture could also be subject to the process of quantization and was then, and only then, mathematically equivalent to quantum mechanics. The flexibility of the mathematical scheme illustrated Bohr's concept of complementarity. By this term 'complementarity' Bohr intended to characterize the fact that the same phenomenon can sometimes be described by very different, possibly even contradictory pictures, which are complementary in the sense that both pictures are necessary if the 'quantum' character of the phenomenon shall be made visible. The contradictions disappear when the limitations in the concepts are taken properly into account. So we spoke about the complementarity between wave picture and particle picture, or between the concepts of position and velocity.

In later literature, there have been attempts to give a very precise meaning to this concept of complementarity. But it is at least not in the spirit of our discussions in the Copenhagen of 1927 if the unavoidable lack of precision in our language shall be described with extreme precision. There have been other attempts to replace the traditional language of physics with its classical concepts for the description of the phenomena, by a new language which should be better adapted to the mathematical formalism of quantum theory.
But the development of language is a historical process, and artificial languages like Esperanto have never been very successful. Actually, during the past 50 years, physicists have preferred to use the traditional language in describing their experiments with the precaution that the limitations given by the relations of uncertainty should always be kept in mind. A more precise language has not been developed, and it is in fact not needed, since there seems to be general agreement about the conclusions and predictions drawn from any given experiment in this field.

2 In Praise of Uncertainty

GORDON REECE
Imperial College, London

1. THE PSYCHOLOGICAL BASIS OF OUR NEED FOR CERTAINTY

The first post-natal experiences of a human being are necessarily associated with learning about the world in which he or she lives. Ideally, his emotional needs will be satisfied in much the same way as his physical requirements. Indeed, these various aspects are inextricably intertwined, centring on the mother's breast, which supplies at once food, warmth, reassurance and companionship. From the point of view of a very young baby, the idea of contentment cannot be separated from his confidence in the consistency and reliability of the world as he sees it. For him, happiness means the certainty that his food will arrive when he needs it, at the correct temperature and of a reliable composition. Later he becomes aware of non-animate objects, some of which fail to interact with him (passive objects like floors and walls), while others (like mattresses, blankets and rattles) respond when pushed or shaken. Gradually, a baby builds up a library of objects in which he can have confidence. Floors can safely be crawled on; thin air cannot. Walls can be bumped without apparent damage (to the walls) while balls and bottles roll away when pushed. He learns to categorize the objects around him.
Fine gradations are learned from the varying degrees of, for example, softness of floor coverings, and intensities of light, noise and warmth. None of these distinctions, however, rivals the fundamental importance of simple 'yes/no' questions such as 'Am I hungry?' or 'Am I wet?' It is not until a baby is much older (say a year, when his feelings about the world will already have begun to gel) that he begins to confuse the issue with questions like 'Am I very hungry?'

The real source of the baby's confidence in the external world is the certainty that if something is wrong it will be remedied. Uncertainty ('Where is Mummy?', 'Where am I?' or 'Why am I still hungry?') represents insecurity, a loss of confidence in the external world and consequent unhappiness. The baby's confidence relies also on a belief in causality: 'If I cry, then Mummy will come', 'If I get milk, then I shall no longer be hungry', and the action of crying represents this reliance.

Eventually the child acquires the verbal skills to express his feelings, and to extend them by asking questions. No-one who has lived under the ceaseless questioning of a normal four-year-old child can have failed to detect consistent trends in the style of interrogation. Typically, one is asked questions like: 'Why can't we live upside down?' . . . followed by 'How long can you stand on your head?', 'Why can you stand on your head longer than me? Because you're older than me?' Such questioning is aimed at imposing a simple logical structure on seemingly haphazard phenomena. The 'simplest' structures, for this purpose, are perfect correlations amounting to causal relationships (age correlated with ability to stand on one's head). Each set of phenomena is dealt with more or less in isolation. Thus it is unusual for a child to follow a set of questions like these with, say, the question 'Why does being older make you better at standing on your head?'
Some causal mechanism is taken for granted, and the exact details are not necessarily of interest. Far more likely is the catch question: 'Then why can't Granny stand on her head longest? She's older still.' Already, however, the child has sufficient confidence in causality that he will tend to dismiss odd exceptions to general rules, whether these rules are ones that he has thought up for himself or generally accepted truths, such as 'The older you are the wiser and taller you are'. It takes a lot of dwarves and imbeciles to convince a child that this is not always so. And the realization that Granny, despite her age, can no longer stand on her head does not appear to have even the briefest effect on the child's confident quest for definite causal connections. It is of little or no relevance to what extent the child's desire to impose a logical structure on the external world is in some sense innate, and to what extent it is a function of his upbringing. The only point of real significance is the universality of this desire, and its intensity. If it is a consequence of the direction of the child's thinking by the external world, and in particular by the adults in that world, this is a remarkable tribute to our ability to mould children in our image. There is, however, little evidence for such a view: it seems much more likely that we are born with a hefty predisposition towards a belief in causality and a desire for certainty. The most compelling evidence for this latter view is the fact that we can interpret the behaviour of animals in much the same way as we interpret the behaviour of human beings. We do not find any lack of 'logic' in the behaviour of chimpanzees, snakes or even amoebae. We do not need a special vocabulary to describe the intelligence of animals: indeed it is standard practice to use the behaviour of animals to help us understand ourselves. 
We assume that the same analysis as we know to be valid for human behaviour will give correct results when applied to other creatures: we are of course imposing our own preconceptions on their behaviour. Such methods have so far justified themselves by producing self-consistent results.

As time passes, the child grows up, matures and begins to 'think for himself'. It is an attribute of intelligent adolescents that they tend to question accepted values 'for the sake of it'. Unfortunately they have by this time lost their desire to challenge really fundamental 'truths' and their doubting has begun to take place within a well-defined framework of accepted authority and standard techniques. Above all, they have the confidence that all questions have answers, and that most questions have exact answers.

It is a mark of true maturity to be able to function in the absence of certainty: for example, the ability to be 'good' without the certainty of ultimate retribution for one's wickedness. It is much easier to search for minor deviations from an accepted truth, and to suggest the appropriate minor modifications, than to search freely for correlations and to discover one's own truths. In practice, the free search is also a good deal slower and less efficient: hence the popularity of ready-made orthodoxies of every kind.

2. THE HISTORY OF CERTAINTY

It is clear that the desire for certainty and the belief in causality are not restrictions upon human thought imposed by the peculiar requirements of the external world but vice versa. In other words we can answer Eddington's disturbing question 'How much do our theories tell us about Nature, and how much do we contribute ourselves?' as follows: the very notion of causality and the desire for certainty are imposed by us on Nature. To pursue a metaphor due to Eddington, we are inclined to trawl the data of physics with a causal net. Small wonder, then, that we turn up just what we hope to find.
For example, we tend still to use the vocabulary and methodology of classical physics when dealing with the phenomena of a submicroscopic world. We search for macroscopic analogues, such as Bohr's atom, and set them up with such plausibility that they inevitably become obstacles to the further understanding of the very phenomena they purport to illuminate. Precisely because they are easy to understand in themselves the analogues tend to take on a life of their own. The first job of each succeeding generation of physicists is then to demolish the simplifications of their predecessors. The lay world, its representatives in the scientific establishment, and in times past even the Church, have all naturally thrown their weight behind conventional wisdom. The consequent emphasis on the destruction of bad old theories rather than the untrammelled construction of good new ones has hindered the development of science. In particular it has slowed down the process of acceptance of new theories by making them seem far more revolutionary than they really are.

As with so many of the implicit 'values' of science, that of precision can be attributed to the Ancient Greeks. Their preoccupation with, for example, the problem of 'commensurability' is best explained in terms of a feeling on their part that commensurable quantities (rational numbers) were 'good' and irrationals were 'bad'. The alternative possibility, that they found irrationals too difficult to handle, is not particularly plausible, since most of the theorems proved in Greek mathematics for rationals hold equally for irrationals. Thus Archimedes established the rule for balancing weights on a lever for commensurable ratios of weights, though the proof he gave did not of course require such an assumption. When the Pythagoreans proved the existence of irrationals, the notion of approximation was drawn into the vocabulary of physics.
Once established, the Greek attitude to accuracy remained unchallenged for 2,000 years. During that time, Christian civilization had imposed religious standards of 'truth' on science. By associating scientific truth with religious dogma, the Church unwittingly gave science a new importance. That importance is still with us today: it stems from the need to establish new scientific theories beyond reasonable doubt before they could safely be taught. Once accepted, though, a theory wore a 'seal of approval', and could not easily be dislodged.

One dogma of science in which religion has a more than usually large stake is the idea of causality. If we do not need to seek a cause for apparently inexplicable events (such as the existence of the Universe) we do not need to turn to religion for the explanation. Moreover, once it is admitted that there are questions which not only need not but actually cannot be answered (such as 'Where is that electron and how fast is it going?', 'What is the opposite of giraffe?' or 'Why did God create the world?') it soon becomes evident that there are whole realms of human experience which will continue to defy a simple, precise causal analysis.

Plato's system of ideals, those 'absolute objects which cannot be seen other than by thought', still underlies our attitude to mathematics and physics. We tend to think of real objects as imperfect ideal ones. Though no-one has ever found a perfectly smooth plane or an inviscid fluid, the theory of motion of solids and liquids treats friction and viscosity as unfortunate aberrations and irritations. It is more than 80 years since Rayleigh showed (Rayleigh, 1892) that the theory of viscous flow did not in general reduce to inviscid theory as the viscosity tended to zero: modern undergraduate mathematics has yet to acknowledge Rayleigh's discovery.
It was not in fact until the late eighteenth and early nineteenth centuries that scientists felt free to challenge orthodox theories of precision. The most difficult to swallow of all the assertions of the Greeks was the parallel axiom of Euclid. In an age of rationalism it seemed only natural to put it to the test. Gauss took his instruments and set out to establish the truth of the parallel axiom by the only method he knew: that of direct measurement. The inconclusiveness of his results opened up the road to non-Euclidean geometry. The notion that the angles of a triangle might add up to about 180 degrees rather than exactly 180 degrees was capable of overturning the whole edifice of certainty on which physics seemed to be built. The irrational numbers and the transcendentals could be treated as exceptional oddities, but a non-Euclidean world would make uncertainty pervade every aspect of physics.

Laplace is commonly credited with having conjured up a demon capable of predicting the course of all subsequent events, given complete information on every particle in the universe at any one instant. Conventionally, this is regarded as the embodiment of rationalist overconfidence. But this can be read in precisely the opposite way: even if the world were purely causal it would require an impossibly well-informed demon to make proper use of the fact. Consequently, the world will necessarily seem unpredictable to us. Because we cannot hope to obtain the necessary information, we must resign ourselves to an inability to predict the vast majority of phenomena. Thus, although this may not be a theoretical limitation, in practice it introduces a new level of uncertainty.

The laws of thermodynamics, likewise first properly formulated at the beginning of the nineteenth century, tell us more about what we cannot know or do. In particular, they rule out the possibility of a perfect machine, and hence of perpetual motion.
Thus another aspect of perfection had to be abandoned.

3. THE ACCEPTANCE OF UNCERTAINTY

Gradually, therefore, imperfection, inaccuracy, unpredictability, uncertainty and randomness were accepted into physics. It is reasonable to relate this increasing tolerance to the growing maturity of science. By analogy with the growth of sophistication in the human being, we see that the history of science is the story of the realization that the world is not so simple as we should like it to be, that we cannot hope to achieve absolute certainty, and that we cannot hope to know or understand everything. Nor is it necessary to 'explain' everything that we do not understand as the manifestation of some supernatural force, simply because we do not understand it. Belief in such dogmas is arrogance thinly disguised as humility. True humility in science consists in knowing that we do not know.

The culmination of the acceptance of uncertainty came in the decade between 1925 and 1935. 1926-7 saw publications by Heisenberg identifying the inherent uncertainty associated with certain measurements. In 1931 Goedel published his Ueber formal unentscheidbare Saetze der Principia Mathematica, showing that the axiomatic method itself had inherent limitations. In 1934 Popper published the Logik der Forschung, which showed that the nature of a scientific hypothesis required its falsifiability, and which finally demolished the notion of absolute scientific proof (or disproof). It was thus doubt and scepticism that distinguished the scientist, and not confidence and certainty.

It is clear that the awareness of the fallibility of the tools they use has made scientists much more careful in the way they derive and present their results. For example, the interaction between observer and phenomenon is recognized as a crucial factor in any sociological investigation.
Indeed, in all experimental work involving living creatures, the effect of the experiment itself (the presence of experimenters and their measuring instruments) on the outcome is now recognized.

Logically, the next step should be to review our approach to the publication of the results of scientific investigations. Success is essentially trivial: it is failure to detect a satisfactory, simple causal explanation of a phenomenon that stimulates speculation. Currently, scientific journals concentrate on the essential work of cataloguing success. How much more exciting would be the publication of phenomena that defy correlation. It is 'anomalies' (such as that of the motion of the planet Mercury) that point the way out of inadequate theories and into the excitement of new fields.

It is easy to point to analogies between the position of quantum mechanics in 1926, and that of fundamental particle physics in 1976. It may well be that in order to resolve their current dilemma physicists may once again have to think the unthinkable, and challenge the very foundations of their subject.

REFERENCE

Strutt, J. W. [3rd Baron Rayleigh] (1892) 'On the question of the stability of the flow of fluids', Phil. Mag., 34, 59-70.

On the Meaning of the Time-Energy Uncertainty Relation

JERZY RAYSKI and JACEK M. RAYSKI, JR.
Jagiellonian University, Cracow, Poland

Soon after the formulation of the usual uncertainty relations between Cartesian coordinates of particles and their momenta

$$\Delta x \cdot \Delta p_x \simeq \Delta y \cdot \Delta p_y \simeq \Delta z \cdot \Delta p_z \simeq \hbar \tag{1}$$

there appeared the problem of the existence and meaning of a similar relation between time and energy

$$\Delta t \cdot \Delta E \simeq \hbar \tag{2}$$

constituting a natural extension of the three relations [given in equation (1)] of Heisenberg from the point of view of the special theory of relativity. References on the energy-time uncertainty problem may be found, for example, in Carruthers and Nieto (1968).
The relation (2) is not derivable from the formalism of quantum mechanics in the same way as relations (1) were derived, neither can the interpretation of relation (2) be quite analogous to the ordinary interpretation of (1). First of all, in contradistinction to the coordinates x, y, z, the time variable t is not an operator associated with an observable characterizing the particle but is a universal parameter. Moreover, energy is not a generalized momentum canonically conjugate to the time variable t in the usual sense of this word.

Because t is not an observable but a parameter, most physicists do not favour regarding the operator $i\hbar(\partial/\partial t)$ as the operator of energy. They likewise reject the suggestion that the non-commutability of $i\hbar(\partial/\partial t)$ and t implies an impossibility of their 'simultaneous' determination and, consequently, the uncertainty relation (2). Incidentally, it is not at all clear what is meant by a 'simultaneous determination' of t and of any other physical quantity. Any measurement of a physical quantity at a given instant of time means a simultaneous determination of both this quantity and the time at that instant.

For the sake of completeness it should be mentioned that time and energy do form a pair, a generalized coordinate and its canonically conjugate momentum, within the framework of the so-called homogeneous canonical formalism, but this formalism does not constitute a basis and starting point for quantization. The latter is performed with the help of the ordinary canonical formalism where the energy, i.e. the Hamiltonian, is not to be regarded as one more of the generalized coordinates or momenta.
It was argued that the Hamiltonian H rather than the operator $i\hbar(\partial/\partial t)$ plays the role of the operator of energy and, in order to formulate the fourth uncertainty relation, one should look for an operator which would play the role of a generalized coordinate, canonically conjugate to the Hamiltonian (taken to be a generalized momentum) or, vice versa, look for a generalized momentum canonically conjugate to the energy (the latter regarded as a generalized coordinate). But there are serious difficulties with defining such an operator. To show some of them let us limit our considerations to one space-like dimension x. The Hamiltonian for a free particle in non-relativistic quantum mechanics is $p^2/2m$, and the (formal!) operator t satisfying the relation

$$[H, t] = -i\hbar \tag{3}$$

is

$$t = \frac{m}{2}\left(x\,\frac{1}{p} + \frac{1}{p}\,x\right) + f(x) \tag{4}$$

where f(x) is an arbitrary function of x. (There is a correspondence between the operator (4) and the classical time if one puts f(x) = 0 and recalls the fact that in classical physics p = mv and for a free particle v = x/t.) However, the trouble is that the operator (4) is not well defined because the inverse of the operator p does not exist inasmuch as the spectrum of p includes the value zero. Thus, the domains of definition of the operators H and t given by (4) are not the same. Consequently, the operator H cannot play the role of a generalized coordinate correlated with a canonically conjugate momentum and it is impossible to derive the fourth uncertainty relation in an analogous way to that which was used to prove the usual three relations of Heisenberg.

Recently Eberly and Singh (1973) claimed to have circumvented this difficulty by constructing a reciprocal time-operator. However, their determination of the fourth uncertainty relation has been achieved in a very round-about way so that we shall not present it here. In what follows, we will present another, very straightforward and direct derivation of this relation.
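The formal commutation relation (3) for the operator (4) can be checked mechanically. The following is a minimal sketch, not part of the original argument, under these assumptions: units with $\hbar = 1$, the momentum representation in which $x = i\,d/dp$, the arbitrary function in (4) set to zero, and test functions restricted to Laurent polynomials in p (which sidesteps, rather than resolves, the problem at p = 0 discussed above). Writing $t = iT$ with $T = (m/2)(\frac{d}{dp}\frac{1}{p} + \frac{1}{p}\frac{d}{dp})$, relation (3) becomes $[H, T]\psi = -\psi$. All function names are illustrative.

```python
from fractions import Fraction

# A Laurent polynomial sum_n c_n p**n is stored as the dict {n: c_n}.

def scale(poly, factor):
    """Multiply every coefficient by a constant."""
    return {n: c * factor for n, c in poly.items()}

def add(a, b):
    """Add two Laurent polynomials, dropping zero coefficients."""
    out = dict(a)
    for n, c in b.items():
        out[n] = out.get(n, 0) + c
    return {n: c for n, c in out.items() if c != 0}

def shift(poly, k):
    """Multiply by p**k."""
    return {n + k: c for n, c in poly.items()}

def ddp(poly):
    """Differentiate with respect to p."""
    return {n - 1: c * n for n, c in poly.items() if n != 0}

def H(poly, m):
    """Free Hamiltonian p**2 / (2 m), acting by multiplication."""
    return scale(shift(poly, 2), Fraction(1, 2) / m)

def T(poly, m):
    """T = (m/2) (d/dp 1/p + 1/p d/dp), so that t = i*T for x = i d/dp."""
    first = ddp(shift(poly, -1))    # d/dp applied to psi / p
    second = shift(ddp(poly), -1)   # (1/p) applied to d(psi)/dp
    return scale(add(first, second), m / 2)

def commutator_H_T(poly, m):
    """[H, T] psi = H T psi - T H psi."""
    return add(H(T(poly, m), m), scale(T(H(poly, m), m), -1))

# Arbitrary test function and mass; [H, T] psi should equal -psi.
psi = {-3: Fraction(7), 0: Fraction(3), 1: Fraction(-2), 4: Fraction(5)}
mass = Fraction(3, 2)
print(commutator_H_T(psi, mass) == scale(psi, -1))
```

Exact rational arithmetic is used so that the identity holds exactly rather than to rounding error. The check is purely algebraic; it says nothing about domains of definition, which is where the text locates the real difficulty.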
Not only a derivation of relation (2) but also its interpretation in a way which is closely analogous to the interpretation of the usual relations (1) seems to be impossible. In fact, according to quantum mechanics, if the energy spectrum is discrete we may construct a stationary solution of the Schrödinger equation describing the system in an eigenstate of energy. In this case energy is exactly known for any time instant determined with an arbitrarily high precision $\Delta t < \varepsilon$. But also in the case of a continuous energy spectrum it is possible to construct a solution so that the energy is determined up to an arbitrarily small uncertainty $\Delta E < \varepsilon$, and this solution remains almost stationary for a very long time interval, i.e. energy is known almost exactly for any instant (determined with an arbitrarily high precision) within a long time interval. This contradicts sharply a naive interpretation of formula (2) according to which $\Delta t$ means uncertainty of the time instant at which the particle possessed an energy E known within the limits of inaccuracy $\Delta E$.

One could look for an excuse and explanation of the appearance of the above-mentioned difficulties with the uncertainty relation (2) in the fact that ordinary quantum mechanics is a non-relativistic theory. Being at odds with the requirements of relativity, quantum mechanics treats the time coordinate on a different footing from the three space-like coordinates of the particles. This may constitute a reason why relation (2) does not hold true in the ordinary quantum-mechanical description of physical phenomena. On the other hand, relation (2) is known to be satisfied in quite another context, viz. as a relation between the uncertainty of energy and the mean lifetime of unstable particles. But unstable particles are not described satisfactorily within the framework of quantum mechanics.
They may be described consistently only within the framework of quantum field theory, where the number of particles is an observable which need not be a constant of motion. As is well known, it is quantum field theory but not quantum mechanics that may be truly reconciled with relativity, and so the accord of relation (2) with quantum field theory as well as the disaccord with quantum mechanics seem to be explicable.

The above excuse for the appearance of serious difficulties with the problem of relation (2) in quantum mechanics is not convincing. The ordinary quantum mechanics of a single particle may be regarded as a limiting case of quantum field theory in the low-energy region and in the subspace of one-particle states (as the number of massive particles becomes constant in the low-energy limit). Thus, if relation (2) holds true (in a certain sense) in quantum field theory, it should also remain valid in the above-mentioned limit. Indeed, relation (2) does not involve the magnitude of the mean value of energy and should be valid also in the low-energy limit.

We may present still another argument against the view that the non-relativistic form of ordinary quantum mechanics is to be blamed for the difficulties appearing in connection with the problem of relation (2). It is a common feature of relativistic theories that 'fourth' relations (completing some three-dimensional relations known from pre-relativistic physics) are often only formally analogous to their three-dimensional counterparts whereas their meaning and interpretation are different. Let us illustrate this statement with an example: in the relativistic extension of Newtonian mechanics there appears a fourth equation of motion of a point particle, formally quite similar to the ordinary three equations.
But its physical and even mathematical sense is quite different: it does not introduce any new degree of freedom and does not increase the number of independent equations of motion, because it is dependent upon the usual three equations of motion and relates the energy change to the work, i.e. expresses the law of conservation of energy. In the non-relativistic limit this fourth relation does not disappear because energy is conserved also in non-relativistic dynamics. Similarly, if in a relativistic theory the existence of a fourth relation (2) is to be expected, this relation should appear also in the limiting case of a non-relativistic theory, although its motivation and its physical meaning do not need to be similar to those of the usual relations (1).

In order to show that the uncertainty relations (1) as well as (2) must hold true it is not necessary to invoke the whole machinery of quantum mechanics; one may limit oneself to a consideration of de Broglie wave packets. Let us consider a function f(x) in $R_1$ and define the dispersion (uncertainty) of x with respect to f(x) in the usual way

$$\Delta x = \left(f, [x - (f, x f)]^2 f\right)^{1/2} \tag{5}$$

where

$$(f, A f) = \int dx\, f^{*} A f \tag{6}$$

Let us consider the Fourier transform of the function f(x)

$$g(k) = \frac{1}{(2\pi)^{1/2}} \int dx\, f(x)\, e^{-ikx} \tag{7}$$

A well-known mathematical theorem (see e.g. Heisenberg, 1930) says that the minimum of the product of dispersions $\Delta x \cdot \Delta k$ is obtained if f(x) is of a Gaussian shape

$$f(x) = \frac{1}{(2\pi)^{1/4} (\Delta x)^{1/2}} \exp\left(-\frac{x^2}{4(\Delta x)^2}\right) \tag{8}$$

Then also g(k) is of a Gaussian form

$$g(k) = \frac{1}{(2\pi)^{1/4} (\Delta k)^{1/2}} \exp\left(-\frac{k^2}{4(\Delta k)^2}\right) \tag{9}$$

where $\Delta x$ and $\Delta k$ appearing in (8) and (9) are identical with the dispersions defined according to (5). Moreover, their product is shown to be

$$\Delta x \cdot \Delta k = \tfrac{1}{2} \tag{10}$$

This is a mathematical fact, quite independent of the meaning of the variables x and k.
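The value 1/2 for the product of dispersions of a Gaussian and its Fourier transform can be confirmed numerically. The sketch below is illustrative only: it assumes a width of 0.7 (an arbitrary choice), a Gaussian whose squared modulus has dispersion equal to that width, uniform grids, simple Riemann sums, and the sign convention $e^{-ikx}$ in the transform. The normalization constant is omitted because it cancels in the dispersion. The names `grid`, `dispersion` and `fourier` are hypothetical helpers.

```python
import cmath
import math

def grid(lo, hi, n):
    """Return n equally spaced points on [lo, hi] and the spacing."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)], step

def dispersion(points, amplitudes):
    """Dispersion in the sense of definition (5), weighted by |amplitude|**2."""
    weights = [abs(a) ** 2 for a in amplitudes]
    norm = sum(weights)
    mean = sum(w * q for w, q in zip(weights, points)) / norm
    var = sum(w * (q - mean) ** 2 for w, q in zip(weights, points)) / norm
    return math.sqrt(var)

def fourier(xs, fs, dx, k):
    """Riemann-sum approximation of the Fourier transform at wave number k."""
    total = sum(f * cmath.exp(-1j * k * x) for x, f in zip(xs, fs))
    return total * dx / math.sqrt(2.0 * math.pi)

WIDTH = 0.7  # assumed dispersion of |f|**2 (illustrative value)

xs, dx = grid(-10.0 * WIDTH, 10.0 * WIDTH, 561)
fs = [math.exp(-x * x / (4.0 * WIDTH * WIDTH)) for x in xs]

ks, _ = grid(-8.0, 8.0, 321)
gs = [fourier(xs, fs, dx, k) for k in ks]

delta_x = dispersion(xs, fs)
delta_k = dispersion(ks, gs)
product = delta_x * delta_k
print(delta_x, delta_k, product)
```

Nothing in the computation refers to the physical meaning of the two variables, which is the point made in the text: relabelling them as time and frequency changes nothing in the arithmetic.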
By identifying x with a Cartesian coordinate of a particle in ordinary space and k with the inverse of de Broglie's wavelength divided by $2\pi$, so that $k = 2\pi\lambda^{-1}$, one gets the usual uncertainty relation between the Cartesian coordinate and momentum (expressed in such units that $\hbar = 1$). But we may as well replace x by t and k by the frequency $\omega$, which yields

$$\Delta t \cdot \Delta\omega = \tfrac{1}{2} \quad \text{or} \quad \Delta t \cdot \Delta E = \frac{\hbar}{2} \tag{11}$$

Thus, the fourth uncertainty relation (2), or more precisely (11), is a direct consequence of the wave aspect of matter.

While the existence of relation (11) is beyond any doubt, the problem of its interpretation still remains open. Let us stress once more that the following interpretation: 'the information about the value E of the energy of a particle and the information about the instant t at which it possessed this amount of energy are incompatible unless both pieces of information are subject to uncertainties $\Delta E$ and $\Delta t$ whose product is not smaller than $\hbar/2$' is incorrect.

In order to find a correct interpretation of the fourth uncertainty relation let us come back once more to a discussion of the ordinary uncertainty relations between position and momentum. In this case it is also incorrect to say simply that these relations mean an impossibility of surpassing the exactitude of information about momentum and position of a particle beyond the limits imposed by the formulae (1). This last statement is not correct because one can measure the position of a particle first (say at $t_1$) with an arbitrary exactitude and afterwards (say at $t_2$) measure its momentum also with an arbitrarily high precision, so that in the interval $(t_1, t_2)$ bounded by the two instants of measurement the exactitude of our information about position and momentum surpasses, indeed, the limits imposed by the relations (1).
Heisenberg's uncertainty relations, if correctly understood, mean something else, namely the following two facts:

(a) A simultaneous direct measurement of coordinate and momentum of a particle with an exactitude surpassing the limits (1) is impossible.
(b) If the two measurements were performed consecutively then only the result of the latter may be used for probabilistic predictions of the future, while the result of the former measurement becomes completely disactualized* and invalidated due to the uncontrollable disturbance of the particle by the latter measurement.

Thus, the point of utmost importance as regards the correct interpretation of (1) is that it determines the limits for an accuracy of simultaneous (i.e. at a fixed instant $t_0$) measurements of x and p and, consequently, for a maximal precision of prescribing the initial values of the parameters of the system that are necessary for the computation of its temporal development.

Substituting t for x and E for p, we also must not forget to perform suitable substitutions in the interpretational comments: exactly as (1) is valid for a fixed value $t = t_0$, the relation (2) must be valid for a fixed value $x = x_0$, otherwise the analogy of the two uncertainty relations (in a two-dimensional space-time) would be incomplete and would lead us astray. But what does it mean that the relation (2) applies to a fixed point $x = x_0$? Obviously, it means that if one is observing the particle (represented by a wave packet with a given $\Delta E$) pass the point $x = x_0$ during its propagation along the x-axis, then one is unable to say when it will pass the point $x = x_0$ with an exactitude greater than $\Delta t = \hbar/\Delta E$.
The more exact is the knowledge of the particle energy the less exact is the time instant of passing (of this particle) by a fixed point on the x-axis and vice versa: the more exactly we know the instant at which a particle passed by an arbitrary but fixed point (on the x-axis) the less exact must be our knowledge about its energy. Such is the proper sense of the time-energy uncertainty relation for a free particle in a two-dimensional space-time. To our knowledge, such an interpretation has not been stated explicitly in any of the extremely numerous scientific articles and textbooks on quantum theory.

* The result of the former measurement remains valid for probabilistic retrodictions of the past (see J. Rayski, 1973).

Going over from a two- to a three-dimensional space-time the fixed point turns into a fixed line, and going over to a four-dimensional space-time it becomes a surface. In this case the fourth uncertainty relation may be interpreted as follows: $\Delta t$ means the uncertainty of the instant when the particle will cross this surface. The product of this uncertainty $\Delta t$ and the uncertainty of energy $\Delta E$ cannot be smaller than $\tfrac{1}{2}\hbar$.

The above-mentioned surface may be a closed surface constituting the boundary of a three-dimensional domain whose volume V may be assumed to be finite, and we may ask about the instant when a particle will cross this surface and enter the domain in question. Again, the knowledge of the instant of crossing this boundary by an ingoing particle cannot be made certain beyond the exactitude imposed by the uncertainty relation (2).

The question about ingoing particles which enter a given domain by passing from its exterior into its interior across its surface is a problem of boundary conditions. In quantum mechanics one usually considers either wave functions in the whole space or in a finite domain but with non-penetrable walls.
In neither case does the problem of how many particles enter or leave the domain in question, and when, appear. But it is a very natural problem to consider a finite domain in space and to ask for a solution of the Schrödinger equation in this domain under given initial conditions (say at t = 0) and under some boundary conditions determining the ingoing waves, i.e. the waves crossing the surface of the domain into its interior (for t > 0). Such mixed boundary-initial conditions determine uniquely the solution in this domain for t > 0 and enable one to compute the outgoing waves crossing the surface of the domain from its interior to its exterior. This is the most natural approach to a description of scattering phenomena occurring in a finite domain.

Now, whereas the initial conditions at t = 0 have to be consistent with the ordinary uncertainty relations between coordinates and momenta, the boundary conditions for t > 0 must be consistent with the time-energy uncertainty relation: the knowledge of when an ingoing particle enters the domain in question and the knowledge of its energy are subject to uncertainties satisfying the relation (11).

In conclusion it may be stated that the ordinary uncertainty relations are related to the initial value problems at a space-like hypersurface whereas the energy-time uncertainty relation is connected with the boundary problems on closed time-like hypersurfaces, e.g. abstract (i.e. freely penetrable) walls restricting a finite domain during a finite or infinite time interval.

Hitherto in our discussion we tacitly assumed wave packets describing free particles. This fact reminds us of an objection raised by Eberly and Singh (1973) in a footnote to their article. We quote: 'The conventional understanding is essentially dichotomous. That is, the uncertainty times associated with wave packet spreading and with excited-state decay are regarded as unrelated consequences of the uncertainty principle.
This point of view is apparent in every quantum mechanics text known to the authors.' The question arises whether this objection applies also to our understanding of the energy-time uncertainty relation.

The first reason for the appearance of this dichotomy is simply the fact that quantum mechanics is in principle unable to describe unstable systems. Therefore the relation (2) applied to the lifetimes of unstable systems and the uncertainty of their rest masses can be applied only to quantum field theory where the numbers of particles are not constants of motion. But assuming quantum field theory we may ask the following question: how can we know that an unstable particle has undergone a decay? Obviously by surrounding a macroscopic domain D, in the interior of which the particle is situated, by detectors in order to register the decay fragments outgoing from the domain through its boundary (equipped with detectors). But this is just a particular case of the above described boundary-initial problems: at the initial instant $t_0$ we assume the presence in the domain D of an unstable system characterized by an uncertainty of energy $\Delta E$. As for the boundary condition we assume that no particles will penetrate into the domain from its exterior at $t > t_0$. We look for outgoing waves of the decay fragments. According to our previous discussion the time of crossing the boundary by the decay fragments must remain uncertain within $\Delta t \simeq \hbar/\Delta E$. But the uncertainty of the instant of escaping from the domain is related to a similar uncertainty $\Delta t$ of the decay instant. Thus, the time of decay counted from an arbitrary initial time instant $t_0$ (when the system was known to be still a bound state) could be anything within the interval $(t_0, t_0 + \Delta t)$. Consequently, the mean lifetime of the system is something like one half of $\Delta t$ and the product $\Delta t \cdot \Delta E$ is, indeed, of the order of magnitude of Planck's constant.
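As a rough numerical illustration of the order-of-magnitude statement above, the relation $\Delta t \simeq \hbar/\Delta E$ can be evaluated for a few energy widths. The sketch assumes only the value $\hbar \approx 6.582 \times 10^{-16}$ eV s; the two widths are hypothetical and are not data from the text, and no factor of order one (such as the 'one half' mentioned above) is tracked.

```python
HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV * s

def lifetime_from_width(width_ev):
    """Order-of-magnitude time scale hbar / Delta E for an energy width in eV."""
    return HBAR_EV_S / width_ev

def width_from_lifetime(lifetime_s):
    """Order-of-magnitude energy width hbar / Delta t for a time scale in seconds."""
    return HBAR_EV_S / lifetime_s

# Hypothetical widths, chosen only to show the range of time scales involved.
examples = [
    ("hypothetical narrow level", 1.0e-7),
    ("hypothetical broad resonance", 1.0e8),
]
for label, width_ev in examples:
    print(label, width_ev, "eV ->", lifetime_from_width(width_ev), "s")
```

A width of the order of $10^{-7}$ eV thus corresponds to a time scale of a few nanoseconds, while a width of the order of 100 MeV corresponds to roughly $10^{-23}$ s.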
Let us remark that the boundary condition consisting of an assumption that no particles enter the domain for $t > t_0$ was necessary because otherwise we would have to deal with an induced decay which might affect considerably the mean lifetime of the unstable system.

From the above discussion it is obvious that the two uncertainties, one connected with the spreading out of wave packets representing free stable particles and the other connected with the problem of the lifetime of unstable systems, are not dichotomous, and the objection of Eberly and Singh does not apply to our interpretation of the energy-time uncertainty relation.

Our explanation of the fourth uncertainty relation may be summarized as follows: if there existed a well defined 'time-operator' canonically conjugate to the Hamiltonian then the fourth uncertainty relation would be independent of the usual ones. But this is not the case. Time does not need to be and, in fact, is not an operator but a mere parameter. However, just as the fourth equation of Newton in relativistic mechanics is not independent of the remaining three equations but is a consequence of them, so the fourth uncertainty relation exists and is a straightforward consequence of the remaining three relations and the wave character of particles. The proper interpretation of the energy-time relation is connected with the problem of boundary conditions in quite an analogous way to that in which the usual uncertainty relations are connected with the problem of initial conditions. In particular, the uncertainty $\Delta t$ is related to the problem of when a particle is penetrating across a given surface.

REFERENCES

Carruthers, P. and Nieto, M. M. (1968) Rev. Mod. Phys., 40, 411.
Eberly, J. and Singh, L. P. S. (1973) 'Time operators, partial stationarity, and the energy ...', Phys. Rev. D, 7, 359-362.
Heisenberg, W. (1930) Die Physikalischen Prinzipien der Quantentheorie, S.
Hirzel, Leipzig.
Rayski, J. (1973) 'The possibility of a more realistic interpretation of quantum mechanics', Foundations of Phys., 3, 89-100.

4

A Time Operator and the Time-Energy Uncertainty Relation

ERASMO RECAMI
University of Catania, Italy

1. INTRODUCTION

In nuclear physics and in elementary particle physics (at low energies) it is usual to have recourse only to monochromatic plane waves and to the time-independent formulation of quantum mechanics.

With the aim of making quantum mechanics as 'realistic' as possible, let us on the contrary adopt a space-time description of the collision phenomena, by introducing wave packets. Notice that, even when dealing with many wave packets, it is not necessary at all to have recourse to unphysical, multidimensional spaces. On the contrary, if we want to preserve the individuality of the considered packets, we must just supply a temporal (realistic and physical) description of them within the ordinary, three-dimensional space.

As soon as a space-time description of interactions has been accepted, one can immediately realize, even in the framework of the usual wave-packet formalism (Olkhovsky and Recami, 1968, 1969), that a quantum operator for the observable time is operating. Namely, it is implicitly used for calculating the packet time-coordinate, the flight-times, the interaction-durations, the mean-lifetimes of metastable states and so on (Recami, 1970; Olkhovsky and Recami, 1970; Baldo and Recami, 1969; Olkhovsky, 1967).

A preliminary, heuristic inspection of the formalism (Olkhovsky and Recami, 1968, 1970) suggests the adoption of the following 'operators' (Olkhovsky and Recami, 1970; Baldo and Recami, 1969)

$$\hat{t}_1 = -i\frac{\partial}{\partial E}; \qquad \hat{t}_2 = -\frac{i}{2}\frac{\overleftrightarrow{\partial}}{\partial E}; \qquad [E \equiv E_{\text{tot}}] \tag{1}$$

acting on a wave-packet space which we must carefully define [because of the differential character of the 'operators' (1)].

2. MATHEMATICAL INTRODUCTION

Let us first consider, for simplicity, a free particle in the one-dimensional case, i.e. the packet:

$$F(t, x) = \int_0^{\infty} dp \cdot F(E, p) \cdot \exp[i(px - Et)] \tag{2}$$

where $\hbar = 1$ and $E = p^2/2m_0$. The integral runs only over the positive values owing to the 'boundary' conditions imposed by the initial (source) and final (detector) experimental devices. Notice that, in so doing, we chose as the frame of reference that one in which source and detector are at rest: i.e. the laboratory reference frame. In particular, notice that we are considering for simplicity the case of source and detector at rest one with respect to the other.

Let us now observe that the packet (average) position is always to be calculated at a fixed time $t = \bar{t}$; analogously, the packet time-coordinate is always to be calculated (by suitably averaging over the packet) for a position $x = \bar{x}$ along a particular packet-propagation-ray. Therefore, in our case we can fix a particular $x = \bar{x}$, and restrict ourselves to considering, instead of the packets (2), the functions:

$$F(t, \bar{x}) = \int_0^{\infty} dp \cdot f'(p, \bar{x}) \cdot \exp[-iEt] = \int_0^{\infty} dE \cdot f(E, \bar{x}) \cdot \exp[-iEt] \tag{3}$$

where $E \equiv E_{\text{tot}} = E_{\text{kin}} = p^2/2m_0$, and $f' = f \cdot dE/d|p|$. Functions $F(t, \bar{x})$ and $f(E, \bar{x})$, being only functions either of t or of E, respectively, are neither wave functions (that satisfy any Schrödinger equation), nor do they represent states in the chronotopic or four-momentum spaces. Let us briefly set:

$$F \equiv F(t) \equiv F(t, \bar{x}); \qquad f \equiv f(E) \equiv f(E, \bar{x}) \tag{4a}$$

It is easy to go from functions F, or f, back to the 'physical' wave packets, so that one gets a one-to-one correspondence between our functions and the 'physical states'. We shall respectively call 'space t' and 'space E' the functional spaces of the F's and of the transformed functions f, with the mathematical conditions that we are going to specify.
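To make the fixed-position functions of equation (3) concrete, the sketch below builds an energy amplitude, forms the corresponding function of time at a fixed position by direct summation, and averages the time with weight $|F|^2$. Everything specific here is an assumption made for illustration: units with $\hbar = 1$, the numerical values of mass, position, central energy and energy spread, the Gaussian envelope (narrow enough to be negligible at E = 0), and the phase $\exp(ip\bar{x})$ with $p = \sqrt{2 m_0 E}$. The helper names are hypothetical.

```python
import cmath
import math

M0 = 1.0       # mass, in units with hbar = 1 (illustrative value)
X_BAR = 20.0   # fixed position at which the packet is observed
E0 = 50.0      # central energy of the packet
SIGMA_E = 2.0  # dispersion of |f|**2 in energy

def grid(lo, hi, n):
    """Return n equally spaced points on [lo, hi] and the spacing."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)], step

def moments(points, amplitudes):
    """Mean and dispersion of `points`, weighted by |amplitude|**2."""
    weights = [abs(a) ** 2 for a in amplitudes]
    norm = sum(weights)
    mean = sum(w * q for w, q in zip(weights, points)) / norm
    var = sum(w * (q - mean) ** 2 for w, q in zip(weights, points)) / norm
    return mean, math.sqrt(var)

# Assumed amplitude f(E, x_bar): Gaussian envelope times the phase exp(i p x_bar).
energies, dE = grid(E0 - 8.0 * SIGMA_E, E0 + 8.0 * SIGMA_E, 321)
f = [
    math.exp(-((E - E0) ** 2) / (4.0 * SIGMA_E ** 2))
    * cmath.exp(1j * math.sqrt(2.0 * M0 * E) * X_BAR)
    for E in energies
]

# F(t, x_bar) as the energy integral of f(E, x_bar) exp(-i E t), by Riemann sum.
times, _ = grid(0.0, 4.0, 401)
F = [sum(a * cmath.exp(-1j * E * t) for E, a in zip(energies, f)) * dE for t in times]

mean_t, delta_t = moments(times, F)
_, delta_E = moments(energies, f)
v0 = math.sqrt(2.0 * E0 / M0)  # classical velocity at the central energy

print(mean_t, X_BAR / v0, delta_t * delta_E)
```

With these values the mean time comes out close to the classical flight time $\bar{x}/v_0 = 2$, and the product of the time and energy dispersions lies slightly above 1/2, the excess being due to the energy dependence of the velocity across the packet.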
In those spaces, for example, the norms will be:

$$\|F\| \equiv \int_{-\infty}^{\infty} |F|^2 \, dt; \qquad \|f\| \equiv \int_0^{\infty} |f|^2 \, dE \tag{4b}$$

In any case, due to equations (3), the space t and the space E are representations of the same abstract space P, where we indicate

$$F \to |F\rangle; \qquad f \to |f\rangle \tag{4c}$$

where $|F\rangle \equiv |f\rangle$.

For reasons which we shall see later, let us now specify what has previously been said by assuming that space P is the space of the continuous, differentiable, square-integrable functions $f$ that satisfy the conditions:

$$\int_0^\infty |f|^2 \, dE < \infty; \qquad \int_0^\infty \left|\frac{\partial f}{\partial E}\right|^2 dE < \infty; \qquad \int_0^\infty |f|^2 E^2 \, dE < \infty \tag{5}$$

Such a space is dense (von Neumann, 1932) in the Hilbert space of $L^2$ functions defined over the interval $0 < E < \infty$.

3. DEFINITION OF THE TIME OPERATOR

Still within the framework of the usual quantum mechanics with wave packets, let us define in the most natural way:

$$\langle t(\bar{x}) \rangle \equiv \frac{\int_{-\infty}^{\infty} \rho(t, \bar{x})\, t \, dt}{\int_{-\infty}^{\infty} \rho(t, \bar{x}) \, dt}; \qquad \rho \equiv |F|^2 \tag{6}$$

Then we can immediately calculate that

$$\langle t(\bar{x}) \rangle = \langle F|t|F\rangle = \frac{1}{N}\int_{-\infty}^{\infty} [F^* t F] \, dt \tag{7}$$

where $N$ is the normalization factor, and verify that

[equation (8) is illegible in the source]

whence:

$$\langle F|t|F\rangle = \langle f|\, {-\frac{i}{2}}\frac{\overleftrightarrow{\partial}}{\partial E} \,|f\rangle \tag{9}$$

This would suggest adopting as the time 'operator' the bilinear derivation

$$\hat{t} \equiv -\frac{i}{2}\,\frac{\overleftrightarrow{\partial}}{\partial E} \tag{10a}$$

By easy calculations, one realizes that we can also adopt the (standard) operator

$$\hat{t} \equiv -i\,\frac{\partial}{\partial E} \tag{10b}$$

even if at the price of imposing on space-P functions the subsidiary condition $f(0, \bar{x}) = 0$, which is not fully desirable from a physical viewpoint.

Since using the bilinear derivation (10a) as a (bilinear) operator would require a new formalism (Olkhovsky and Recami, 1970), let us prefer here the time operator (10b).

4. TIME-OPERATOR PROPERTIES

Our operator (10b) has many good properties, as listed below.

(1). Equation (9) shows that, in the space t, it reduces (as is very natural) to the mere multiplication by $t$.

(2). Relations such as equation (8) become physically clear when written:

[display illegible in the source]

whence, in accordance with the Ehrenfest principle, it follows that:

$$\langle t \rangle = t_0 + \bar{x}/\langle v \rangle \tag{11}$$

(3). When we pass to a new frame of reference, source and detector will no longer be at rest. However, only the packet properties relative to the detector (and to the source) will still be essential. This is enough to secure the Galilean invariance of our operator.

(4). In the impulse representation, one meets the interesting correspondence ($\hbar = 1$):

[equation (12) is illegible in the source]

where the last addendum vanishes when $\hbar \to 0$.

(5). We have seen already that the space of the (continuous, differentiable) functions satisfying conditions (5) is dense in the Hilbert space of $L^2$ functions defined over the interval $0 < E < \infty$. The first of conditions (5) is the condition for square integrability. The second requires that our operator (10) transform Hilbert-space vectors into Hilbert-space vectors. The third requires that in our space a 'good' energy-operator can be defined. It is easy to verify that our operator (10b) is canonically conjugated (Heisenberg, 1944) to the (total) energy:

$$[\hat{t}, \hat{E}] = -i\hbar \tag{13}$$

(6). Under conditions (5), one gets that:

$$\int_0^\infty f_1^* \,(\hat{t} f_2)\, dE = \int_0^\infty (\hat{t} f_1)^* f_2 \, dE \tag{14}$$

i.e. that our time operator is not only Hermitian, but also symmetric, according to the usual mathematical terminology (Akhieser and Gladsman, 1954).

(7). Having now the time operator (10b) at our disposal, we can immediately obtain, through the standard procedure (see, for example, Caldirola, 1966), the uncertainty correlation:

$$\Delta E \cdot \Delta t \geq \frac{\hbar}{2} \tag{15}$$

In our opinion, equation (15) means that in general the uncertainty $\Delta E$ that one meets when measuring the energy $E$ of a particle is tied to the duration of the actual measurement interaction. For example, let us suppose that we are measuring the energy of a particle by observing its track in a bubble chamber.
If we examine (by means of a photograph) a long track segment, we will be able to have good 'statistics' in counting bubbles, and therefore a good determination of the (average) energy of the particle while producing that track; but the time instant at which the particle possessed that energy will be known with a large uncertainty. Vice versa, if we examine a short track segment, then we shall get a good time measure, but at the price of poor bubble-statistics (see Figure 1). In this example, the experiment (or better, the measurement) is the track-segment examination.

Figure 1 Track of particle in a bubble chamber

5. CASE OF POTENTIAL SCATTERING

When passing to the non-free case, things do not essentially change. Let us consider, for example, the case of the scattering of a (spin-free) particle by a central potential $V(r)$. Inside the potential region, we have packets of partial $l$-waves, distorted by the potential (Calogero, 1967). By the introduction of the functions $F_l^{(\mathrm{in})}(t, r)$, $F_l^{(\mathrm{out})}(t, r)$, and of the transformed ones $B_l^{(\mathrm{in})}(p, r)$, $B_l^{(\mathrm{out})}(p, r)$ (Olkhovsky, Recami and Gerasimchuk, 1974), the time durations are still obtained by using operator (10); and one will still write:

$$\langle F|t|F\rangle_{\mathrm{in,out}} = \langle B|\, {-\frac{i}{2}}\frac{\overleftrightarrow{\partial}}{\partial E} \,|B\rangle_{\mathrm{in,out}} \tag{16}$$

Analogously, equation (13) is also still valid, and so on.

In the particular case of metastable states (Olkhovsky and Recami, 1968; Olkhovsky, 1968; Recami, 1970), let us admit that $V(r) = 0$ for $r > R$, quantity $R$ being the potential radius (see Figure 2). Let us analyse the process: free initial flight; unstable state formation; and decay with subsequent free final flight. Let us calculate the time $\tau_l$ spent by the particle (or better by its $l$ partial wave) inside a sphere with centre in the potential centre and with radius $r > R$.

Figure 2 The scattering of a particle by a central potential

In the presence of resonant elastic scattering, we have:

$$S_l = \bar{S}_l\,\frac{E - E_0 - i\Gamma}{E - E_0 + i\Gamma}; \qquad \delta_l = \bar{\delta}_l - \operatorname{arctg}\frac{\Gamma}{E - E_0} \tag{17}$$

where $\bar{S}_l$ and $\bar{\delta}_l$ are smoothly varying functions in the 'resonance' region. In the narrow resonance approximation, for sufficiently large values of $r$ one obtains:

[equation (18): $\tau_l \simeq 2r\langle v^{-1}\rangle + \ldots$; the resonance term is illegible in the source]

Analogously, one can calculate the duration of the interaction (Olkhovsky and Recami, 1968), or of partial interactions (Olkhovsky and Recami, 1969), in a two-wave-packet collision (Olkhovsky, Sokolov and Zaychenko, 1969). In particular, it seems useful to calculate the interaction duration, $\langle \Delta\tau \rangle_{\mathrm{int}}$, corresponding to the cross-section enhancements: the necessary condition for the peak to be associated with a true resonance will be that $\langle \Delta\tau \rangle_{\mathrm{int}}$ also has a maximum at the energy considered.

6. WHY A TIME OPERATOR WAS NOT INTRODUCED IN STANDARD QUANTUM MECHANICS

After what we have seen of the good behaviour of our operator (10), we can ask ourselves why a time-operator was not introduced in standard quantum mechanics, even though quantum mechanics is typically built up by associating an operator with every observable.

The reason is that operator (10), defined as acting on the space P, is not hypermaximal (von Neumann, 1932), because P is a space of functions defined only over the interval $0 < E < \infty$ and not over the whole $E$-axis. It follows that $\hat{t}$, while being Hermitian and symmetric, is however not self-adjoint, and does not allow identity resolution.

Essentially for these reasons, Pauli (1958) objected to the use of a time-operator, and this had the effect of practically stopping studies on the subject. Von Neumann himself, however, had claimed that considering in quantum mechanics only self-adjoint operators could be too restrictive, and he was followed in this by other authors (e.g. Engelman and Fick, 1959, 1963, 1964; Razavy, 1967, 1969; Landau and Lifshitz, 1963; Aharonov and Bohm, 1961; Papp, 1971, 1972; Rosenbaum, 1969).
This is our conviction: in fact, even if operator $\hat{t}$ does not admit true eigenfunctions, nevertheless we succeeded in calculating the average values of $\hat{t}$ over our functions (and over the physical 'packets' corresponding to them). And that is enough for us. That is also the reason why, after equations (10), we have often written the bilinear form (10a) instead of the standard operator (10b).

To clarify the problem, we shall quote an explanatory example (von Neumann, 1932). Let us consider a particle Q, free to move in a semispace bounded by a rigid wall (see Figure 3). We shall then have $0 < x < \infty$. Consequently, the impulse $x$-component of Q, which reads

$$\hat{p}_x = -i\,\frac{\partial}{\partial x} \tag{19}$$

will be a non-hypermaximal, non-self-adjoint (but only Hermitian, symmetric) operator, even if it is an observable and has a simple physical meaning.

Figure 3 A particle free to move in a semispace bounded by a rigid wall ($0 < x < \infty$)

ACKNOWLEDGEMENTS

The author acknowledges that the core of the present matter was essentially developed in collaboration with Professor V. S. Olkhovsky, and he is also grateful to Professor M. Toller for very useful criticism. His thanks are due to Dr. S. S. Chissick, Dr. A. I. Gerasimchuk and Dr. E. Papp for their very kind interest.

REFERENCES

Aharonov, Y. and Bohm, D. (1961) Phys. Rev., 122, 1649.
Akhieser, N. I. and Gladsman, I. M. (1954) Theorie der linearen Operatoren im Hilbert-Raum, Akademie Verlag, Berlin.
Baldo, M. and Recami, E. (1969) Lett. Nuovo Cimento, 2, 643.
Caldirola, A. (1966) 'Istituzioni di Fisica Teorica', Ambrosiana, Milano.
Calogero, F. (1967) Variable Phase Approach to Potential Scattering, Academic Press, New York.
Engelman, F. and Fick, E. (1959) Supplem. Nuovo Cimento, 12, 63.
Engelman, F. and Fick, E. (1963) Z. Phys., 175, 271.
Engelman, F. and Fick, E. (1964) Z. Phys., 178, 551.
Heisenberg, W.
(1944) Die Physikalischen Prinzipien der Quantentheorie, 4th ed., Hirzel, Leipzig.
Landau, L. D. and Lifshitz, E. M. (1963) Kvantovaya Mekhanika, Nauka, Moscow.
Olkhovsky, V. S. (1967) Nuovo Cimento, 48 B, 170.
Olkhovsky, V. S. (1968) Ukr. Fis. Zh., 13, 143.
Olkhovsky, V. S. and Recami, E. (1968) Nuovo Cimento, 53 A, 610.
Olkhovsky, V. S. and Recami, E. (1969) Nuovo Cimento, 63 A, 814.
Olkhovsky, V. S. and Recami, E. (1970) Lett. Nuovo Cimento, 4, 1165.
Olkhovsky, V. S., Recami, E. and Gerasimchuk, A. I. (1974) Nuovo Cimento, 22 A, 263.
Olkhovsky, V. S., Sokolov, L. S. and Zaychenko, A. K. (1969) Soviet J. Nucl. Phys., 9, 114.
Papp, E. (1971) Nuovo Cimento, 5 B, 119.
Papp, E. (1972) Nuovo Cimento, 10 B, 69, 471.
Pauli, W. (1958) Handbuch der Physik, Flügge, S. Ed., Vol. 5/1, p. 60, last ed., Springer-Verlag, Berlin.
Razavy, M. (1967) Am. Journ. Phys., 35, 955.
Razavy, M. (1969) Nuovo Cimento, 63 B, 271.
Recami, E. (1970) Acc. Naz. Lincei, Rendic. Sc., 49, 77 (Rome).
Rosenbaum, D. M. (1969) J. Math. Phys., 10, 1127.
Von Neumann, J. (1932) Mathematische Grundlagen der Quantenmechanik, Hirzel, Leipzig.

Quantum Theory of the Natural Space-Time Units

ERHARDT W. R. PAPP
Polytechnic Institute of Cluj, Romania

1. INTRODUCTION

For more than 50 years the quantum-mechanical space-time description problem has aroused justified interest and has been a rich source of insight. Overcoming difficulties, physicists have investigated this subject initially from certain points of view, and reinvestigated it subsequently in a relatively more evolved context. The history of space-time quantization represents in fact the most significant and profound aspect of the history of quantum theory itself.
Throughout the years attempts have been made to analyse, though only provisionally, the peculiarities of a common quantum-mechanical description of space-time and matter, and space-time quantization has come to be regarded as one of the fundamental problems in the scientific understanding of nature.

The conceptually new content of quantum mechanics is expressed by the explicit recognition that measurements cannot be objectively performed with indefinitely increasing accuracy. In these conditions we have to consider the existence of an ultimate (non-zero) accuracy of the space-time measurements. This ultimate accuracy results principally from the new role of the measuring apparatus as a physical object which is itself constituted from really existing microparticles. Generally, the microparticles have to be considered neither as points, nor as having a rigorously defined spatial extension. This assumption is supported by a certain structure of the physical microparticle and vice versa. Considering the microparticle coincidences as the elementary acts of the space-time measurements, there results the existence of an intrinsic space-time allowance (March, 1941). This allowance is able to offer by itself the possibility of defining, now in a natural way (Bohm and coworkers, 1970), the natural space-time units. In this respect the quantum-mechanical space-time measuring process can be considered as the counting process of the successive elementary coincidences. Moreover, the structure of quantum mechanics as a proper physical theory, with a well-established form, can be generally deduced from the laws of the measurements (Ludwig, 1972).

It now becomes necessary for the mathematical formalism of quantum mechanics to be explicitly in agreement with the existence of the non-zero space-time imprecisions.
For this purpose a suitably extended quantum-mechanical formalism is needed which has to contain the space-time imprecisions as fundamental entities. Such a formalism has also to permit the consistent definition of the natural space-time units as certain lower bounds of the space-time imprecisions. In this sense account has to be taken of the existence of certain levels of depth in the quantum-mechanical description of the microparticles: atoms, nuclei and elementary particles. One would then expect to obtain the Bohr radius, the Compton wavelength and the electron radius as the natural units for atoms, for free elementary particles and for (interacting) electrons, respectively. There is also the Planck radius, which has to be the natural space constant of the gravitational field.

Such a quantization programme can be realized by a suitable use of the binary description formalism. In this sense the complex numbers $(\alpha - i\beta)$, and not exclusively the real ones, are allowed to describe the results of the measurements (Kalnay and Toledo, 1967). The result of the measurement is now expressed, in a relatively more complete manner, by the pair $(\alpha, \beta)$ of real numbers, and alternatively by the interval $[\alpha - \beta, \alpha + \beta]$ (or $(\alpha - \beta, \alpha + \beta)$) on the real $\alpha$ axis. This segment is unequivocally defined by the complex number. In this respect the space imprecision approach proposed by Flint (1948) is relatively 'incomplete', as he uses not an interval on the real axis, but only a translated point. The space-time imprecisions now become inner elements of the theory, as they are defined as the imaginary parts of the binary (non-Hermitian) operator averages (Papp, 1972a, b; 1973; 1974a, b, c). In such conditions von Neumann's axiom, which legitimizes the description of the physical observables only by hypermaximal operators, is in fact not rejected, but extended (Fick and Engelmann, 1964; Olkhovsky, Recami and Gerasimchuk, 1974).
A short history of the space-time quantization problem will be presented in Section 2. The meaning of the binary space-time description will be analysed, in terms of the collision-time evaluations, in Section 3. In this way it is proved that the space-time imprecisions are able to express certain limitations on the accuracy of the space-time measurements. Section 4 is devoted to the definition of the binary space-time operators and of the corresponding space and time imprecisions. There it is proved that the binary description of space-time is mutually connected with that of the action. In Section 5 the uncertainties of the binary space and time operators are evaluated. The high-energy approach to the space-time imprecisions will be performed in Section 6. The physical meaning of the electron radius, of the Compton wavelength, of the Bohr radius and of the Planck radius as natural units is also analysed. Except in self-evident cases, units will be chosen so that $\hbar = c = 1$.

2. SHORT HISTORY OF THE SPACE-TIME QUANTIZATION

Some opinions about the existence of space-time quanta were expressed before the development of quantum mechanics (Poincaré, 1913; Proca, 1928; Kaluza, 1921). During the period of the main development of quantum mechanics the idea of an atomistic structure of space-time was explicitly formulated by Thomson (1926), Levi (1926), Pokrowski (1928), Latzin (1927), Beck (1929), Schames (1933) and others. In this respect a preferential meaning was attributed to the electron and/or nuclear radius. Attempts were made to define a theory of the physical constants as a consequence of the existence of the space-time quanta and of the upper value of the elementary particle rest-mass (Beck, 1929; Schames, 1933).
An essential step towards assuming the existence of an ultimate accuracy of the space-time measurements, in agreement with the mathematical formalism of quantum mechanics, was stimulated and supported by the Heisenberg (1927) uncertainty relations. In these conditions a conceptually more general approach to space-time quantization was formulated on the basis of the existence of the elementary space ($\hbar/m_0 c$) and time ($\hbar/m_0 c^2$) uncertainties by Ruark (1928), Flint and Richardson (1928), Fürth (1929), Wataghin (1930), Landau and Peierls (1931), Glaser and Sitte (1934) and Flint (1937). The amplitude of the Zitterbewegung (Schrödinger, 1930) has also been interpreted as the result of the existence of the individual space imprecision (Iwanenko, 1931). During this period several fundamental problems were analysed and discussed: the connection between the structure of the elementary particles and the existence of the space-time quanta (Fürth, 1929; Glaser and Sitte, 1934); the necessity of a synthesis between gravity and quantum theory (Fock and Iwanenko, 1929; Wataghin, 1932; Glaser and Sitte, 1934; Flint, 1935, 1937); and the necessity of a deeper connection between electromagnetism and the physical space-time description (Flint, 1935; Möglich and Rompe, 1939).

Further progress in the analysis of the time-energy uncertainty relations is due to Mandelstamm and Tamm (1945), Fock (1962), Fujiwara (1970) and Olkhovsky and Recami (1970), whereas certain objections concerning the meaning of the time-energy uncertainty relations were raised by Aharonov and Bohm (1964) and Bunge (1970). The space uncertainties were evaluated for bound states by Remak (1931), and subsequently calculated for interacting particles by Griffith (1974). A discrete space-time method has been used to evaluate the space-time uncertainties for relativistic particles (Henning, 1956).
Relativistic space uncertainties were calculated for fermions by Blokhintsev (1973). The uncertainty relations have been applied to the gravitational field by Peres and Rosen (1966) and Wheeler (1957), and to the electromagnetic field by Jordan and Fock (1930), Landau and Peierls (1931) and others. There is also the uncertainty-time operator, which has been explicitly proposed for bound states by Eberly and Singh (1973).

To overcome the divergence difficulties of present quantum field theory, to define suitably the high-energy production processes, and also to favour the development of a theory for predicting the elementary particle rest-masses, attempts were made to introduce a fundamental length into quantum theory by Heisenberg (1936, 1938a, 1938b, 1942), March (1936, 1937a, b, c), Ambarzumian and Iwanenko (1930), Markov (1940) and others. Generally this fundamental length has to take the value of the particle size. In agreement also with the preferential meaning of the weak interactions (Heisenberg, 1938a), Kadyshevsky (1961) and Kim (1973) have analysed explicitly the space constant of the weak interactions. In the papers cited above March advocates the necessity of a suitable redefinition of the short-distance geometry. In this sense certain contributions were also made by Wheeler (1957, 1962), Coish (1959), Takano (1961), Blokhintsev (1960, 1973), de Witt (1960) and others.

To support the existence of the fundamental length, space-quantization approaches were developed by Snyder (1947) and Hellund and Tanaka (1954). In their approaches the space quantization is the result of a discrete space eigenvalue problem. Curved-space approaches to space quantization were also proposed (Yang, 1947; Flint, 1948). The compatibility between Lorentz invariance and the existence of discrete space-time quanta has been analysed by Schild (1948) and Hill (1955).
An alternative approach to space quantization has been proposed by Darling (1950), who considers the irreducible volume character of events. Concerning the (cellular) discrete-space approach we have to mention the contributions of Das (1960) and Peters (1974). It is significant that the present-day non-linear, non-local, indefinite-metric and higher-derivative field theories support in one way or another the existence of the fundamental length (see e.g. Vialtzew, 1965). There is also evidence that the predictions formulated earlier by Heisenberg (1938a) concerning high-energy explosions are qualitatively in agreement with the present-day multiparticle production processes.

Meaningful results were obtained by March (1941) in the description of the quantum-mechanical space-time measuring process. In this sense we have to consider that the quantum-mechanical measuring apparatus is essentially more complex than, and rather distinct from, that of relativity (March, 1937a). In this respect a first step towards conceptually joining relativity and quantum mechanics is to consider the reference frame as a component of the quantum-mechanical measuring apparatus (Wataghin, 1930). Certain difficulties concerning the co-existence of the fundamental length with standard Lorentz invariance (Pavlopoulos, 1967) can be overcome, at least qualitatively, e.g. within the extended Lorentz invariance condition used by Schild (1948) and Hill (1955). However, there is evidence to conclude that the general theory of relativity is essentially more suitable for describing extended particles than special relativity. These latter aspects were analysed by Motz (1962, 1972), Markov (1965, 1966), Penrose and MacCallum (1973), Sivaram and Sinha (1974), Lord and coworkers (1974) and others. In this connection the quantization of the gravitational field is of special interest (see e.g. Wheeler, 1957; Treder, 1963; Brill and Gowdy, 1970).
We may thus conclude that the problems raised by space-time quantization have in fact not lost interest or relevance since the appearance of quantum mechanics. Progress was also made in the definition of the space-time operators, as well as in performing the collision-time evaluations (see e.g. Kalnay, 1971; Almond, 1973). It has been proved that further development of the quantum-mechanical space-time description needs an extension of the standard quantum-mechanical formalism (Fick and Engelmann, 1964; Kalnay and Toledo, 1967; Broyles, 1970; Olkhovsky, Recami and Gerasimchuk, 1974 and others). There is also an increasing interest in analysing more deeply certain aspects of the very quantum-mechanical ('shell-pulsating') free-particle description (see e.g. Dirac, 1972). All the facts presented above permit us to assume that the quantum-mechanical space-time description, which is far from being completely resolved, is a fundamental problem characterizing all the steps in the evolution of quantum theory.

3. THE BINARY CONTENT OF THE COLLISION-TIME DESCRIPTION

The time spent by the outgoing (reduced) particle in the interaction region is given for $l = 0$ by (Smith, 1960)

$$\tau_0(a, p) = \frac{2a}{v} + 2\frac{d}{d\omega}\delta_0(p) - \frac{1}{2\omega}\sin 2[pa + \delta_0(p)] \tag{1}$$

so that the collision time-shift is

$$\tau_0^{(\mathrm{coll})}(a, p) \equiv \tau_0(a, p) - \frac{2a}{v} = 2\frac{d}{d\omega}\delta_0(p) - \frac{1}{2\omega}\sin 2[pa + \delta_0(p)] \tag{2}$$

where $a$ is the interaction radius, $\omega = p^2/2m$ and $\delta_0(p)$ the phase-shift for $l = 0$. In agreement with formal scattering theory we shall consider that the phase-shift does not explicitly depend on the interaction radius. In order to eliminate the oscillating term at all costs, a supplementary averaging device with respect to the interaction radius, lying outside the theory, has been imposed by Smith (1960), Jauch and Marchand (1967) and Gien (1965). This averaging device is not only artificial but also physically meaningless.
On the contrary, the presence of the oscillating term has to be maintained in order to preserve the macroscopic causality condition (Wigner, 1955) and especially the causal positivity of the interaction-time evaluation (Papp, 1972a; Baz, 1966; Peres, 1966). In this respect we can already suppose that the presence of the oscillating term has a fundamental theoretical meaning. Indeed, the interaction radius is not uniquely defined. More exactly, if $a$ is an interaction radius, so is any larger value $a' > a$. In these conditions we can cause, at least formally, the last term of the time-shift (2) to oscillate, so that the point-valued collision-time evaluation $2(d/d\omega)\,\delta_0(p)$ is in fact replaced by the interval

$$\left[\, 2\frac{d}{d\omega}\delta_0(p) - \frac{1}{2\omega},\;\; 2\frac{d}{d\omega}\delta_0(p) + \frac{1}{2\omega} \,\right] \tag{3}$$

The width of this interval is independent of the dynamical peculiarities of the collision system. We can thus conclude that the actual observable meaning of the average $2\langle (d/d\omega)\,\delta_0(p) \rangle$ can be suitably defined only within a certain range, whose largest value is given by $\langle 1/2\omega \rangle$. In these conditions we have to consider that the real purpose of the quantum-mechanical description is a double one: to perform the observable time-shift evaluations and to state theoretically the existence of an objective degree of accuracy of the time-shift measurement. The above results express the essential step in the definition of the binary description formalism. In this sense the binary interval (3) describes the measurement in which the observable evaluation $2\langle (d/d\omega)\,\delta_0(p) \rangle$ is obtained within the imprecision $\langle 1/2\omega \rangle$. The fact that the above imprecision is twice as large as the value $\langle 1/4\omega \rangle$ previously calculated (Papp, 1974a) can be explained by noticing that the present imprecision does not refer to a single binary time-shift variable, but to the difference of two binary variables.
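The way the time-shift (2) sweeps out the interval (3) as the interaction radius is enlarged can be illustrated numerically. The sketch below is illustrative only: the single-resonance phase shift `delta0`, its parameters `omega_r` and `gamma`, and the units $\hbar = m = 1$ are assumptions introduced here and are not taken from the text, which leaves $\delta_0(p)$ unspecified.

```python
import numpy as np

def delta0(omega, omega_r=2.0, gamma=0.2):
    """Hypothetical s-wave phase shift: a single Breit-Wigner resonance."""
    return np.arctan2(gamma / 2.0, omega_r - omega)

def ddelta0_domega(omega, omega_r=2.0, gamma=0.2):
    """Analytic derivative of delta0 with respect to omega."""
    return (gamma / 2.0) / ((omega_r - omega) ** 2 + gamma ** 2 / 4.0)

def collision_time_shift(a, p, m=1.0):
    """Time-shift of equation (2) for interaction radius a and momentum p."""
    omega = p ** 2 / (2.0 * m)
    oscillating = np.sin(2.0 * (p * a + delta0(omega))) / (2.0 * omega)
    return 2.0 * ddelta0_domega(omega) - oscillating

def binary_interval(p, m=1.0):
    """End points of the interval (3): centre 2 d(delta0)/d(omega), half-width 1/(2 omega)."""
    omega = p ** 2 / (2.0 * m)
    centre = 2.0 * ddelta0_domega(omega)
    half_width = 1.0 / (2.0 * omega)
    return centre - half_width, centre + half_width

p = 1.8
# Enlarging a by pi/p takes the sine in (2) through one full period.
radii = np.linspace(3.0, 3.0 + np.pi / p, 2001)
shifts = collision_time_shift(radii, p)
lower, upper = binary_interval(p)
print(lower, shifts.min(), shifts.max(), upper)
```

For any admissible radius the computed shift stays inside the interval, and its extreme values approach the two end points, whatever phase shift is chosen; only the centre of the interval depends on the dynamics.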
We can also remark that the binary description formalism is consistent with the starting requirement that the macroscopic causality condition, expressed by the positivity of the interaction-time evaluation, be fulfilled. Indeed, the inequality $|\alpha| > \beta$ expresses both the macroscopic causality condition and the necessary condition for the binary variable $\alpha - i\beta$ to possess measurable meaning.

The above discussion preserves its meaning in the relativistic case, too. Thus, the outgoing time-shift of the elastically scattered Klein-Gordon particle is given, in the one-dimensional case, by

$$\tau_0^{(\mathrm{coll})}(a, p) = 2\frac{d}{dp_0}\delta_0(p) + \frac{p_0}{2p^2}\sin 2[pa + \delta_0(p)] \tag{4}$$

where $p_0 = \sqrt{p^2 + m_0^2}$ (Gien, 1965). Similarly to the non-relativistic case we are now able to define the relativistic binary time-shift description with the imprecision given by $\langle p_0/2p^2 \rangle$. For the Dirac particle one obtains the result

$$\tau_0^{(\mathrm{coll})}(a, p) = 2\frac{d}{dp_0}\delta_0(p) - \frac{m_0}{2p^2}\sin 2[pa + \delta_0(p)] \tag{5}$$

thus stating the existence of the particular time imprecision $\langle m_0/2p^2 \rangle$. In agreement with point (/) of the binary description formalism (Papp, 1973) we can see that the time imprecisions obtained above are binarily 'equivalent':

$$\left\langle \frac{p_0}{2p^2} \right\rangle \sim \left\langle \frac{m_0}{2p^2} \right\rangle \tag{6}$$

up to the threshold velocity $(\sqrt{3}/2)c$. We have to mention that a similar situation arises when comparing the results obtained for the lower bound of the phase-shift derivative by Wigner (1955) and by Goebel, Karplus and Ruderman (1955), respectively. The results so obtained allow us to conclude that the space-time imprecisions are essentially inner elements of the quantum-mechanical description.

4. THE BINARY SPACE-TIME DESCRIPTION

In the application of the correspondence principle there are cases when the resulting operators are not directly hermitian ones. To avoid the introduction of non-hermitian operators, subsequent symmetrization devices were used.
In line with Section 3 we shall consider that such symmetrization devices are in fact outside the proper theory. Consequently, the symmetrized operators cannot be principally used without also allowing the existence of the initial non-hermitian operators as physically meaningful. In such conditions we have to consider the initial non-hermitian operators as the binary operators which are able to originate the standard hermitian ones. Thus the classical expressions for the projection of the position vector on the momentum direction and of the free evolution time corresponding to that direction are 1 mo p-r and f p =— rp (7) respectively, where p = |p|. Applying directly the correspondence principle we obtain the pair of the mutually conjugated binary space operators ''-$• ''-*-©■' ,8> and the associated pair of the binary time-operators t„ — /nor (4), r p =f*-m (ft)' (9) respectively. We then easily obtain, in agreement also with Lippmann (1966), the hermitian space-time operators as ff-kfp + f'p) and ff^Wp + t'p) (10) respectively. Averaging the binary operators with respect to the non- relativistic wave packet <p(r, t) = (2ir)~ 1/2 I dpa(p) exp i(p • r- cat) (11) where / is the spatial dimension, there results w--(^»*«w} + '<'- 1 >(£) (12) and < f P >=-^-i arga(p) ) +/ ' (/ - 2) (i) (13) thus allowing the definition of the space and time imprecisions as the imaginary parts of the above averages. In order to define the time operator for/ = 1, the 36 Uncertainty Principle and Foundations of Quantum Mechanics boundary condition lim pi pi->0 1/2 a(/>i)<°° (14) is needed whereas for j = 2 and / = 3 the wave-packet form-factor has to be only bounded at the origin (Papp, 1974c). However, for / = 1, appreciable limitations are not implied as the subspace denned by the well-behaved condition (14) is dense in the whole Hilbert space. The above boundary conditions maintain their meaning also in the relativistic case. 
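The dimensional factors $(j - 1)$ and $(j - 2)$ that accompany the space and time imprecisions can be traced to a divergence in momentum space. The following worked step is a sketch under stated assumptions rather than the author's derivation: it assumes a normalized form-factor $a(\mathbf{p}) = |a|\,e^{i\theta}$ in $j$ dimensions with vanishing boundary terms, and a binary operator of the generic form $A = i\,\partial_{\mathbf{p}} \cdot \mathbf{u}(\mathbf{p})$ (derivative acting on everything to its right), together with its conjugate ordering $A' = i\,\mathbf{u}(\mathbf{p}) \cdot \partial_{\mathbf{p}}$. The vector field $\mathbf{u}$ and the symbols $A$, $A'$, $\theta$ are notation introduced here.

```latex
\begin{align*}
\langle A' \rangle
  &= \int d^{j}p \; a^{*}\, i\,\mathbf{u}\cdot\partial_{\mathbf{p}} a
   = -\Big\langle \mathbf{u}\cdot\partial_{\mathbf{p}}\theta \Big\rangle
     + \frac{i}{2}\int d^{j}p \; \mathbf{u}\cdot\partial_{\mathbf{p}}|a|^{2} \\
  &= -\Big\langle \mathbf{u}\cdot\partial_{\mathbf{p}}\theta \Big\rangle
     - \frac{i}{2}\Big\langle \partial_{\mathbf{p}}\cdot\mathbf{u} \Big\rangle ,
\\[4pt]
\langle A \rangle
  &= \langle A' \rangle + i\,\Big\langle \partial_{\mathbf{p}}\cdot\mathbf{u} \Big\rangle
   = -\Big\langle \mathbf{u}\cdot\partial_{\mathbf{p}}\theta \Big\rangle
     + \frac{i}{2}\Big\langle \partial_{\mathbf{p}}\cdot\mathbf{u} \Big\rangle ,
\\[4pt]
\mathbf{u} = \frac{\mathbf{p}}{p}:\quad
  & \partial_{\mathbf{p}}\cdot\mathbf{u} = \frac{j-1}{p}
  \;\Longrightarrow\;
  \tfrac{1}{2}\Big\langle \partial_{\mathbf{p}}\cdot\mathbf{u} \Big\rangle
  = (j-1)\Big\langle \frac{1}{2p} \Big\rangle ,
\\[4pt]
\mathbf{u} = \frac{m_{0}\,\mathbf{p}}{p^{2}}:\quad
  & \partial_{\mathbf{p}}\cdot\mathbf{u} = \frac{m_{0}(j-2)}{p^{2}}
  \;\Longrightarrow\;
  \tfrac{1}{2}\Big\langle \partial_{\mathbf{p}}\cdot\mathbf{u} \Big\rangle
  = (j-2)\Big\langle \frac{1}{4\omega} \Big\rangle ,
  \qquad \omega = \frac{p^{2}}{2m_{0}} .
\end{align*}
```

The two orderings share the same real part and differ only in the sign of the imaginary part, so their half-sum is hermitian, while the magnitude of the imaginary part reproduces the $(j-1)\langle 1/2p \rangle$ and $(j-2)\langle 1/4\omega \rangle$ factors. The time case vanishes identically for $j = 2$ and needs the extra boundary condition at $p \to 0$ for $j = 1$, where $1/p^{2}$ is not integrable against a form-factor that stays finite at the origin.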
There is also a mutual connection between the binary description of the action and the one of space-time (Papp, 1974a). Indeed, the r-p-action average is given by (15) < r .p>=-(p.A arga(p) ) + ,(^) +/ . so that the binary description of the action with the imprecision /(ft/2) implies the existence of the binary description of the time with the imprecision given by ;'<ft/4<u> and vice versa. Alternatively, there is implied a binary description of the space shift or of the time shift (p'i arga(p) ) m (-^- — arga(p)^ with the imprecisions given byj(h/2p) and/<ft/4a>>, respectively. We can thus conclude that the purpose of the quantum-mechanical binary space-time description is indeed a double one: to perform the observable space-time (shifts) evaluations and to define the imprecisions of the space-time (shifts) measurements. In the relativistic case the binary space, time and action (p t - p • r) operators are given by ''—(£)• '>'•(■?) and *-"*■(£) (16> respectively. Averaging the action operator a with respect to the Klein- Gordon wave packet <D (+) (r, t) = (2tt)- >/2 f d PT Ui (+) (p) exp (-ipx) J v2p where px=p t~P' r, one obtains (17) (18) Papp 37 so that the implied space-shift and time (time-shift) imprecisions are ■ W -(/-2>(£) ("> 8«>s and ^ = (/-2)(^) (20) respectively. Requiring now the action imprecision to be larger than ft/2, it results that the relativistic binary description of the action maintains ft/ 2 as the natural unit of the action only when m c ft ft 2p 2 ~2 (21) i.e. when the existence of the threshold velocity (v2/2)c is quantum- mechanically allowed. In this respect we have to consider that for velocities larger than (V2/2)c, the elastically scattered particle (in the centre-of-mass system) ceases to preserve the initial single free-particle individuality. 
As a consequence we can no longer ignore the structure of the particle, so that in the high-energy region ($v > (\sqrt{2}/2)c$) the particle is in fact replaced by the system of its constituents. In this respect we shall attribute a fundamental theoretical meaning to the threshold velocity $(\sqrt{2}/2)c$, in agreement also with the fact that the same threshold velocity value has been obtained within general relativity theory (Jaffe and Shapiro, 1972). We can also remark that the existence of the natural unit of the action implies the existence of the natural space unit

$$ \delta_s = \frac{\hbar}{2m_0c} \tag{22} $$

or, alternatively, of the natural time unit

$$ \delta_t = \frac{\sqrt{2}\,\hbar}{2m_0c^2} \tag{23} $$

which is binarily 'equivalent' to the constant $\hbar/2m_0c^2$. Under these conditions the Compton wavelength $\hbar/2m_0c$ can also be interpreted as the extent of the spatial localization region of the free particle. This interpretation agrees with the fact that the spatial localization (overlapping) of the free-particle field operators at two points in space is of the same order as the Compton wavelength (see e.g. Schröder, 1964; Griffith, 1974). However, the factor $\sqrt{2}$ in expression (23) needs some additional explanation. Firstly, the threshold velocities implied by the space imprecisions are generally not identical with those corresponding to the time imprecisions (Papp, 1973). On the other hand, a special juncture arises when we compare the results of Schild (1948) and Hill (1955). Allowing the existence of space-time quanta and imposing the (extended) requirement of Lorentz invariance, it is implied that there is only a certain set of allowed velocities, which corresponds to the existence of integral-number co-ordinates (Schild) and of rational-number co-ordinates (Hill), respectively.
But in the particular case when the rational number is also an integral one, the above sets are not, as one would expect, identical. In such a situation it would be justifiable to conclude that, in the high-energy region, the existence of a certain velocity allowance cannot be overcome (when also imposing the Lorentz invariance condition within approaches supporting the existence of space-time quanta). Averaging the binary time operator one obtains the imprecision (24) [illegible in the source], which is larger than expression (20) by the imprecision amount $\langle 1/2p_0\rangle$. As

$$ \frac{p_0}{2p^2} < 2\,\frac{1}{2p_0} \tag{25} $$

for $v > (\sqrt{2}/2)c$, we may conclude, by virtue of the above binary 'equivalence', that the average $\langle 1/2p_0\rangle$ in fact expresses the time imprecision in the high-energy region. The binary space operator leads to the same space imprecision of $(j-1)\langle 1/2p\rangle$. Analysing the binary meaning of the action operator $tp_0 - \mathbf{r}\cdot\mathbf{p}$, where $t$ is now the time parameter, one obtains the action imprecision $(j-1)\hbar/2$, the time-shift imprecision (26), the time imprecision

$$ (j-1)\left\langle\frac{p_0}{2m_0^2}\right\rangle \tag{27} $$

and the space-shift imprecision (28) [equations (26) and (28) are illegible in the source]. Apart from the dimensional factor $(j-1)$, it can easily be seen that, at the threshold velocity $(\sqrt{2}/2)c$, the action and both time imprecisions become identical with those of (18) and (22), respectively. Similarly, the $\mathbf{r}\cdot\mathbf{p}$-action leads to the imprecisions (29) [illegible in the source], thus confirming again, at least 'binarily', the unequivocal values of the space-time imprecisions. The action operator $(p_0, t_p)$ possesses the imprecision (30) [illegible in the source], which implies, now in a separate direct way, the existence of the threshold velocity $(\sqrt{2}/2)c$ for $j = 1$. It is worth mentioning that the action operator $tp_0$ possesses the imprecision $\hbar/2$. This operator is a binary one only for the Klein-Gordon particle. For the Dirac particle the binary action and space-time quantizations can be performed similarly.
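The velocity range quoted for inequality (25) can be checked numerically. The sketch below assumes that (25) reads $p_0/2p^2 < 2\,(1/2p_0)$, works in units $c = 1$ with rest mass $m_0 = 1$, and uses the hypothetical helper names `momenta` and `inequality_25_holds`, which are not taken from the text.

```python
import math


def momenta(beta, m0=1.0):
    """Return (p, p0) for rest mass m0 at speed beta = v/c, in units c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return m0 * gamma * beta, m0 * gamma


def inequality_25_holds(beta):
    """True when p0/(2 p^2) < 2 * 1/(2 p0), the assumed reading of (25)."""
    p, p0 = momenta(beta)
    return p0 / (2.0 * p * p) < 2.0 / (2.0 * p0)


# Candidate threshold quoted in the text: v = (sqrt(2)/2) c.
threshold = math.sqrt(2.0) / 2.0

for beta in (0.50, 0.70, 0.72, 0.90):
    print(f"v/c = {beta:.2f}  inequality holds: {inequality_25_holds(beta)}")
print(f"threshold v/c = {threshold:.4f}")
```

Algebraically the inequality is equivalent to $p_0^2 < 2p^2$, i.e. $p^2 > m_0^2$, which is the same momentum threshold $p = m_0c$ that appears in condition (21).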
The imprecision of the binary proper-time operator is essentially given, now in a manifestly Lorentz-invariant way, by the Compton wavelength (Papp, 1972b). This gives further evidence for the meaning of this length as the natural space constant of the free particle. In this case the space- and time-shift imprecisions (31) are implied for the Dirac particle, whereas (32) are the space- and time-shift imprecisions for the Klein-Gordon particle. The implied time ($t$) imprecisions are given by (33), respectively [equations (31) to (33) are illegible in the source]. The imprecision $\langle p_0/2m_0^2\rangle$ is binarily compatible with $\langle p_0/2p^2\rangle$ in the high-energy region:

$$ \frac{p_0}{2p^2} \le 2\,\frac{p_0}{2m_0^2} \tag{34} $$

for $v > (\sqrt{3}/3)c$. This latter threshold velocity agrees numerically with the one which has been defined in general relativity theory, too (Treder, 1974). We may thus conclude that the existence of certain common threshold velocities allows us to assume that some premises needed by a properly unified theory of space-time and matter have in fact already been fulfilled.

5. THE UNCERTAINTIES OF THE BINARY SPACE-TIME OPERATORS

There is a certain formal analogy between the binary space-time description and that of the space and time uncertainties. Indeed, in both cases the existence of a certain interval associated with the measuring process is considered. But whereas in the first case the interval is the primary element of a theoretical description which has to express by itself the 'objective' imprecision of the measurement, the uncertainty interval (also centred around the mean value) expresses in a rather conventional manner the general statistical accuracy limits of the measurement.
In spite of these essential distinctions, we shall prove that certain space and time uncertainty contributions, the so-called uncertainty units, can be placed on the same footing as the space and time imprecisions (Papp, 1975). This holds not only for the binary space-time operators, but also for the hermitian one-dimensional space operators. Let us begin with the evaluation of the space uncertainty for the Dirac particle. The starting relations for the action of the binary operators on $\exp i\mathbf{p}\cdot\mathbf{r}$ (in which $u(\mathbf{p}, s)$ is the positive-energy spinor), the resulting averages expressed through $\arg b(\mathbf{p}, s)$, the $j$-dimensional Dirac-particle wave packet $\psi^{(+)}(\mathbf{r}, s, t)$ and the simplifying approximations occupy equations (35) to (41) [illegible in the source]. With these approximations there results (42) [illegible in the source] and

$$ \Delta r_p^2 = \Delta v^2\,t^2 + \Delta r_p^{(\min)2} + \Delta r_p^{(\mathrm{unit})2} \tag{43} $$

where (44) is the square of the minimum space uncertainty, (45) is the square of the space-uncertainty unit [both illegible in the source] and $\Delta v^2$ is the square of the velocity uncertainty. The space-uncertainty contribution $\Delta v\,t$ can be neglected by formally taking $t = 0$. Using the Kronecker symbols, expression (45) takes the form (46) [illegible in the source]. We can easily remark that in the two-dimensional case there arises an additional contribution due to the space imprecision $\langle 1/2p\rangle$. For $j = 1$ and $j = 3$, one obtains the bound (47) [illegible in the source] for $\langle v^2\rangle < c^2/2$. The above results confirm our assumption that the space-uncertainty unit possesses the physical meaning of the space imprecision.
The calculations can be similarly performed for the Klein-Gordon particle, leading to the results (48) and (49) [illegible in the source]. We can see that in the one- and three-dimensional cases the space-uncertainty unit of the Klein-Gordon particle takes the maximum imaginary value at the threshold velocity $(\sqrt{2}/2)c$, as expressed by (50) [illegible in the source]. Under these conditions the minimum space uncertainty of the Klein-Gordon particle has to be generally smaller than half of the Compton wavelength. The above requirement would also signify that the (elastically scattered) Klein-Gordon particle is not necessarily a free one in the high-energy region. Indeed, the one-dimensional space uncertainty evaluation has to take positive values, as the space operator is hermitian. On the other hand, if we require the minimum space uncertainty $\hbar/2\Delta p$ to be larger than $\hbar/4m_0c$, it follows that $\Delta p < 2m_0c$. Under these conditions, in order to assure the general validity of the quantum-mechanical description irrespective of any particular case, we also have to consider the condition $\langle p\rangle < 2m_0c$, as there exist cases when $\Delta p = \langle p\rangle$. Otherwise there would exist cases for which $\Delta p > 2m_0c$. Consequently the Compton wavelength is able to express the size of the Klein-Gordon particle only in the free case of a not too large energy (in order also to preserve the initial single-particle individuality). In the non-relativistic case the square of the space-uncertainty unit is given by (51) [illegible in the source], so that the space-uncertainty unit is 'binarily' identical to the space imprecision $\langle 1/2p\rangle$.

The time-uncertainty units can be similarly calculated, thus obtaining the results (52), (53) and (54) [illegible in the source] (Papp, 1975) for the Dirac, Klein-Gordon and non-relativistic particles, respectively. Besides the space imprecision $\langle 1/2p\rangle$, the previously encountered collision time-shift imprecision $\langle m_0/2p^2\rangle$ is now also implied.
It can now be easily shown that the squares of the space and time uncertainty units of the Dirac particle are larger than the corresponding squares of the Klein-Gordon particle by the amounts $\langle 1/4p_0^2\rangle$ and $\langle 1/4p^2\rangle$, respectively. These results are in fact in agreement with expressions (31) and (32), thus proving, in this way also, the general inner consistency of the binary description. There is also a mutual compatibility of the space-time imprecisions with the space-time uncertainty units. We can show that there is no irreconcilable difference between the space and time imprecisions. Thus the average $\langle 1/2p\rangle$ is not only the space imprecision, but it possesses the meaning of a time imprecision, too. Similarly, the average $\langle 1/2p_0\rangle$ also possesses the meaning of a space imprecision.

6. THE HIGH-ENERGY SPACE-TIME IMPRECISION DESCRIPTION

Up to now the meaning and relevance of the Compton wavelength with respect to the (non-large energy) free (scattered) particle has been analysed. Proofs were given that the space-time imprecisions possess a well-defined physical significance within the quantum-mechanical collision time-shift, binary space-time and space-time uncertainty descriptions. We shall now develop an approach which is able to define, in a unitary and direct manner, the Compton wavelength, the electron radius and (by extrapolation) the Bohr radius as natural space-time units. For this purpose let us assume that there is a 'spectrum' of the relevant space-shift evaluations which correspond to the various levels of the quantum-mechanical description of matter. Such a space-shift 'spectrum' has to be described by the generalized 'eigenvalue' equation

$$ \frac{\partial}{\partial p}\,\delta_l(p) = N\,\frac{1}{2p} \tag{55} $$

where $N$ is the space-multiplicity parameter ('eigenvalue') and where, for convenience, a well-defined value of the angular momentum has been chosen.
We shall also subsequently consider that the production processes expected to arise in the high-energy region can be qualitatively supported by a resonance emission approximation. Equation (55) has to define by itself the physical meaning of the space imprecision in the high-energy region. In this sense equation (55) has to establish a close connection between the existence of the natural space-time units and the high-energy structural effects raised by the validity of the resonance emission approximation. Consequently, a high-energy interacting-particle approach is also implied, as the collision interaction can be qualitatively supported by the formation and subsequent decay of a resonance state (Peres, 1966). Under these conditions the above space shift also has to be considered as an interaction space shift (Papp, 1972a). One would expect from the very beginning that the physically relevant $N$-values include $N = 1$ and $N = 3$. Indeed, in agreement with points (b) and (d) of the previous paper (Papp, 1973), the necessary condition for the binary variable to possess measurable meaning is $N \ge 1$, whereas $N \ge 3$ has to be considered the sufficient condition for the binary variable to possess (well-defined) measurable meaning.

Besides the above approach to the high-energy space-shift description, we can define another variant by using the high-energy time imprecision $\langle 1/2p_0\rangle$. Starting from the (space-)time imprecision behaviour of the time shift

$$ \frac{\partial}{\partial p_0}\,\delta_l(p_0) = N\,\frac{1}{2p_0} \tag{56} $$

one obtains the phase shift

$$ \delta_l(p_0) = \frac{N}{2}\,\ln\frac{p_0}{m_0} \tag{57} $$

where the present $N$-parameter is not necessarily identical with that of equation (55). In agreement with the previous remarks, we shall for convenience consider relation (56) as a space-shift relation, i.e. we shall take (dimensionally) $p_0 = \sqrt{p^2 + m_0^2c^2}$.
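The passage from the time-shift relation (56) to the phase shift (57) is a single integration. The lines below sketch it under the assumption, not stated explicitly in the text, that the phase shift vanishes at the lower end of the energy range, $\delta_l(m_0) = 0$ (units with $c = 1$).

```latex
\begin{aligned}
\frac{\partial}{\partial p_0}\,\delta_l(p_0) &= \frac{N}{2p_0},
  \qquad \delta_l(m_0) = 0, \\
\delta_l(p_0) &= \int_{m_0}^{p_0}\frac{N}{2p_0'}\,dp_0'
  = \frac{N}{2}\,\ln\frac{p_0}{m_0}, \\
0 \le \delta_l(p_0) \le \pi
  &\iff m_0 \le p_0 \le m_0\exp\frac{2\pi}{N} .
\end{aligned}
```

The last line shows how restricting the phase shift to $[0, \pi]$ fixes an upper end of the energy range for each value of $N$.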
The scattered state function corresponding to the above phase shift is given (in the energy representation) by

$$ \varphi_l^{(\mathrm{sc})}(p_0) = g_l(p_0)\,\sin\delta_l(p_0)\,\exp i\delta_l(p_0) \tag{58} $$

Neglecting the influence of the wave-packet preparation, we shall take the form factor $g_l(p_0)$ to be a constant. Consequently, the interaction time imprecision takes the form

$$ \left\langle\frac{1}{2p_0}\right\rangle = \left[\int_{m_0}^{p_0^{(\max)}} dp_0\,\sin^2\delta_l(p_0)\right]^{-1}\int_{m_0}^{p_0^{(\max)}} dp_0\,\frac{1}{2p_0}\,\sin^2\delta_l(p_0) \tag{59} $$

so that

$$ \left\langle\frac{1}{2p_0}\right\rangle = \frac{\hbar}{m_0c}\,\frac{\pi(N^2+1)}{N^3}\left(\exp\frac{2\pi}{N} - 1\right)^{-1} \tag{60} $$

where we have assumed that $\delta_l(p_0)\in[0,\pi]$, i.e.

$$ p_0^{(\max)} = m_0\exp\frac{2\pi}{N} \tag{61} $$

It can now be easily verified that

$$ \left\langle\frac{\partial}{\partial p_0}\,\delta_l(p_0)\right\rangle \gtrsim \frac{e^2}{m_0c^2} \tag{62} $$

for $N \ge 1$, where instead of $e^2/m_0c^2$ we can also consider, without appreciably affecting the above approximation, the twice smaller value $e^2/2m_0c^2$. Similarly, it results that

$$ \left\langle\frac{\partial}{\partial p_0}\,\delta_l(p_0)\right\rangle \gtrsim \frac{\hbar}{2m_0c} \tag{63} $$

for $N \ge 3$. The existence of the inequality

$$ \left\langle\frac{\partial}{\partial p_0}\,\delta_l(p_0)\right\rangle \gtrsim \frac{\hbar}{m_0c}, \qquad N \ge 4 \tag{64} $$

agrees with the binary description formalism. Indeed, the time-shift evaluations for $N = 3$ and $N = 4$ are binarily equivalent, as are the space constants $\hbar/2m_0c$ and $\hbar/m_0c$. We may thus conclude that the existence of the Compton wavelength and of the electron radius as natural space units is a direct result of the binary description formalism of space and time.

In order to perform the resonance emission approximation we have to impose the maximum-value condition of the scattered state function (Kilian and Petzold, 1970)

$$ \delta_l\big(p_0^{(r)}\big) = \frac{\pi}{2} \tag{65} $$

so that the resonance energies are given by

$$ p_0^{(r)} \equiv p_0^{(r,N)} = m_0\exp\frac{\pi}{N} \tag{66} $$

On the other hand the energy average is given by

$$ \langle p_0\rangle = \frac{m_0c}{2}\,\frac{N^2+1}{N^2+4}\left(\exp\frac{4\pi}{N} - 1\right)\left(\exp\frac{2\pi}{N} - 1\right)^{-1} \tag{67} $$

so that the meaning of the narrow resonance approximation can be analysed in this way also, by comparing expressions (66) and (67) for the same $N$-values.
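The weighted averages of this section can be cross-checked numerically. The sketch below assumes the phase shift $\delta_l(p_0) = (N/2)\ln(p_0/m_0)$ on the range $m_0 \le p_0 \le m_0\exp(2\pi/N)$, a constant form factor, and units $\hbar = c = m_0 = 1$. It compares a direct Simpson-rule evaluation of the $\sin^2\delta_l$-weighted averages with the closed forms $\pi(N^2+1)/[N^3(\exp(2\pi/N)-1)]$ and $\tfrac12\,(N^2+1)/(N^2+4)\,(\exp(4\pi/N)-1)/(\exp(2\pi/N)-1)$. The function names are illustrative and do not come from the text.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def weighted_average(g, N, m0=1.0):
    """Average of g(p0) with weight sin^2(delta), delta = (N/2) ln(p0/m0)."""
    upper = m0 * math.exp(2.0 * math.pi / N)  # delta runs from 0 to pi

    def weight(p0):
        return math.sin(0.5 * N * math.log(p0 / m0)) ** 2

    numerator = simpson(lambda p0: g(p0) * weight(p0), m0, upper)
    return numerator / simpson(weight, m0, upper)


def closed_inverse(N, m0=1.0):
    """Assumed closed form for the average of 1/(2 p0)."""
    return math.pi * (N * N + 1) / (m0 * N ** 3 * (math.exp(2 * math.pi / N) - 1))


def closed_energy(N, m0=1.0):
    """Assumed closed form for the average of p0."""
    x = 2.0 * math.pi / N
    ratio = (math.exp(2.0 * x) - 1.0) / (math.exp(x) - 1.0)
    return 0.5 * m0 * (N * N + 1) / (N * N + 4) * ratio


for N in (1, 3, 4):
    inv_numeric = weighted_average(lambda p0: 1.0 / (2.0 * p0), N)
    energy_numeric = weighted_average(lambda p0: p0, N)
    print(N, inv_numeric, closed_inverse(N), energy_numeric, closed_energy(N))
```

In these units the time-shift average is $N$ times the first quantity; for $N = 3$ it comes out at roughly 0.49, close to the constant $\hbar/2m_0c$ discussed above.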
Thus

$$ p_0^{(r,1)} \simeq 23m_0, \qquad p_0^{(r,3)} \simeq 2.87m_0, \qquad p_0^{(r,4)} \simeq 2.26m_0 \tag{68} $$

whereas

$$ \langle p_0\rangle^{(1)} \simeq 107m_0, \qquad \langle p_0\rangle^{(3)} \simeq 3.12m_0, \qquad \langle p_0\rangle^{(4)} \simeq 2.46m_0 \tag{69} $$

Consequently, the above resonance emission approximation is mathematically consistent for the relatively larger $N$-values, but it could be qualitatively accepted, in a broader sense, even for $N = 1$. Around the value $N = 1$, a very small variation of the $N$-parameter implies large variations of the $\langle p_0\rangle$ and $\langle 1/p_0\rangle$ averages. Under these conditions the inequality

$$ \langle p_0\rangle \le 107m_0 \tag{70} $$

which is valid for $N \ge 1$, can be practically replaced by the inequality

$$ \langle p_0\rangle \le 137m_0, \qquad N \ge 1 \tag{71} $$

Consequently, the electron radius is able to fulfil its role both as natural space unit and as intrinsic size of the interacting electron only up to the 'electromagnetic' threshold velocity $v^{(\mathrm{em})}$ (Papp, 1974b). As a consequence of the inequality (62), a lower bound of the $N$-parameter values can be defined. Indeed

$$ \left\langle\frac{\hbar}{2p_0}\right\rangle \le \frac{\hbar}{2m_0c} \tag{72} $$

so that

$$ N\,\frac{\hbar}{2m_0c} \ge \left\langle\frac{\partial}{\partial p_0}\,\delta_l(p_0)\right\rangle \ge \frac{e^2}{2m_0c^2} \tag{73} $$

Consequently

$$ N \ge \frac{e^2}{\hbar c} = \frac{1}{137} \tag{74} $$

On the other hand, from the condition (61), one obtains

$$ p_0 \le m_0\exp\frac{2\pi}{N} \le p_0^{(\max)} = m_0\exp\frac{2\pi}{\alpha} \tag{75} $$

The so-defined upper energy bound agrees qualitatively with the upper cutoff momentum defined by Greenman and Rohrlich (1973). The extreme extrapolation $N \to 0$ implies not only the breakdown of the high-energy space-time (interaction) imprecision description, but also the breakdown of linear quantum electrodynamics, as that extrapolation leads to the appearance of deep non-linear effects.

Another extrapolation can be performed towards the large $N$-values. Thus, requiring the interaction shift to equal the Bohr radius:

$$ N\,\frac{\hbar}{2m_0c} = \frac{\hbar^2}{2m_0e^2} \tag{76} $$

it results that

$$ N = \frac{1}{\alpha} = 137 \tag{77} $$

But the value $1/\alpha$ is in fact an upper bound of the $N$-parameter, as the energy average becomes in this case practically identical to the rest-mass energy.
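The closing remark, that for $N = 1/\alpha$ the energy average is practically the rest-mass energy, can be illustrated with a few lines of arithmetic. The sketch assumes the closed form $\langle p_0\rangle = \tfrac12\,m_0\,(N^2+1)/(N^2+4)\,(\exp(4\pi/N)-1)/(\exp(2\pi/N)-1)$ for the $\sin^2\delta_l$-weighted energy average (units $c = 1$) and the rounded value $1/\alpha = 137$ used in the text. The name `mean_energy` is illustrative only.

```python
import math

ALPHA_INVERSE = 137.0  # rounded value of 1/alpha used in the text


def mean_energy(N, m0=1.0):
    """Assumed closed form of the sin^2-weighted energy average, in units c = 1."""
    x = 2.0 * math.pi / N
    ratio = math.expm1(2.0 * x) / math.expm1(x)  # (e^{2x} - 1) / (e^{x} - 1)
    return 0.5 * m0 * (N * N + 1.0) / (N * N + 4.0) * ratio


# Ratio of the energy average to the rest mass for N = 1 and N = 1/alpha.
for N in (1.0, ALPHA_INVERSE):
    print(f"N = {N:6.1f}   <p0>/m0 = {mean_energy(N):.4f}")
```

Under these assumptions the ratio $\langle p_0\rangle/m_0$ lies within a few per cent of unity at $N = 137$, while it is of order one hundred at $N = 1$.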
In the present case the Bohr radius is also a natural unit, but now with respect to another quantum-mechanical stratum, that of the bound states of the atomic electron. Indeed, the space-imprecision average $\langle\hbar/2p\rangle$ taken with respect to the momentum-representation state function of the hydrogen atom is given, for $l = 0$, by an expression [illegible in the source] in which $n$ is the principal quantum number. The smallest space imprecision is now given by $8\hbar^2/3\pi m_0e^2$. This result confirms the above assumption concerning the role of the Bohr radius as a natural space unit.

There is a formal analogy between the electrostatic and gravitational interactions of two point particles. Thus, whereas the Compton wavelength maintains its role as natural space unit unchanged, there is the Schwarzschild radius $g(m_0/2c^2)$, where $g$ is the gravitational constant, which corresponds to the classical electron radius. Under these conditions the Planck radius $(1/2c)\sqrt{\hbar g/c}$ (Planck, 1913) is just the geometrical average of the so-defined gravitational space units. It is justifiable to consider that the Planck radius is of fundamental theoretical significance (see e.g. Wheeler, 1957; Treder, 1963; Markov, 1966; Motz, 1972). In this sense this radius depends explicitly only on the universal constants, and it is also the space constant which takes the smallest value. From the quantum-mechanical point of view we can consider, in a strong analogy with the results obtained for the electrostatic field by Heisenberg and Euler (1936), that the Planck radius is mutually connected with the existence of the maximum observable value of the gravitational field strength. The meaning of the Planck radius as the critical distance value of quantum gravity is in agreement with this latter result.
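The statement that the Planck radius is the geometrical average of the two gravitational space units can be verified in one line. The computation below takes the two units to be the Compton wavelength $\hbar/2m_0c$ and the Schwarzschild radius $g\,m_0/2c^2$, as quoted in the text; nothing else is assumed.

```latex
\sqrt{\frac{\hbar}{2m_0c}\cdot\frac{g\,m_0}{2c^{2}}}
  = \sqrt{\frac{\hbar g}{4c^{3}}}
  = \frac{1}{2c}\sqrt{\frac{\hbar g}{c}} .
```

The rest mass $m_0$ cancels in the product, which is why the resulting length depends only on the universal constants $\hbar$, $g$ and $c$.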
We may thus conclude that the natural space-time units analysed above are in fact various aspects of the same space-time imprecision ($\langle 1/2p\rangle$ or $\langle 1/2p_0\rangle$). The existence of the natural space-time units is actually required by the mathematical binary description formalism itself, which also proves both the relevance and the consistency of that formalism. Under these conditions certain steps needed for a mathematically suitable description of the natural space-time units have been established, at least qualitatively.

7. CONCLUSIONS

Throughout this paper certain evidence concerning the existence of the space-time imprecisions as inner elements of an extended quantum-mechanical description has been analysed and discussed. Proofs have also been given that the existence of the natural space-time units is mathematically consistent with the binary description formalism. It turns out that the binary formalism expresses essential aspects of the quantum-mechanical description of space-time and matter. The space-time imprecisions, which were removed from the standard quantum-mechanical description, imply a quite natural extension of quantum mechanics. The formalism so extended fulfils, at least for the moment, the requirements needed to define a quantum theory of the natural space-time units.

One would also assume that the present binary description formalism is not complete, so that further developments need subsequent extensions and refinements. Thus the binary description formalism cannot be adequately applied to the Coulomb or to the static gravitational interactions (between two point particles) without additionally assuming the existence of a certain discrete-space model (Papp, 1974b). Indeed, we can assume that the experimental conditions for performing measurements on the short-distance behaviour are more restrictive than those required by the usual large-distance measurements.
In this sense we have to consider that the space discretization methods imply certain additional restrictions needed to perform adequately the short distance measurements. The possibility exists to define the existence of the maximum observable value of the electrostatic field strength and of the upper bounds of the particle rest-mass and electric charge, too. All these facts led us once again to conclude on the relevance and the deep physical significance of the space-time binary description formalism. REFERENCES Aharonov, Y. and Bohm, D. (1964) 'Answer to Fock concerning the time-energy indeterminacy relation', Phys. Rev., 134 B, 1417-1418. 48 Uncertainty Principle and Foundations of Quantum Mechanics Almond D (1973) 'Time operators, position operators, dilatation transformations and virtual particles in relativistic and nonrelativistic quantum mechanics', Ann. Inst. Henn Pomcare, 19 A, Arnbt^mian, V. and Iwanenko, D. (1930) 'Zur Frage nach Vermeidung der unendlichen Selbstrfiskwirkung des Elektrons'.ZPnys., 64, 563-567 Baz A I (1966) 'Life-time of intermediate states , Yader. Fiz., 4, 232-iou. Beck, G. (1929) 'Die zeitliche quantelung der Bewegung', Z. Phys., 51, 737-739. Blokhintsev, D. I. (1960) 'Fluctuations of space-time metric', Nuooo Omenta, lfc, mz-ik/. Blokhintsev! D.I. (1973) Space and Tune in the Microworld, Dordrecht, Boston. ,_.,_. Bohm, D., Hilley, B. J. and Stewart, A. E. G. (1970) 'On a new model of description in physics , Int. J. Theo'r. Phys., 3, 171-183. . . Brill, D. R. and Gowdy, R. H. (1970) 'Quantization of general relativity , Rep. Prop. Phys., 33, 413-488 Broyles, A.' A. (1970) 'Space-time position operators', Phys. Rev 1 D, 979-988 Bunge, M. (1970) 'The so-called fourth indeterminacy relation', Can. J. Phys., 48, 14 ^ 14U - Coish H R (1959)'Elementaryparticlesinafinite world geometry', Pnys. Rev., 114,383-388. Darling, B. T. (1950) 'The irreducible volume character of events. 
A theory of the elementary particles and of fundamental length', Phys. Rev., 80, 460-466. Das A (1960) 'Cellular space-time and quantum field theory', Nuovo Omenta, 18, 48Z-5U4. Dirac, P. A. M. (1972) 'A positive energy relativistic wave equation', Proc. Roy. Soc. London, \ja a 1—7 Eberly, J. H. and Singh, L. P. S. (1973) 'Time operators, partial stationarity and the energy-time uncertainty relation', Phys. Rev., 7 D, 359-362. Fick, E. and Engelmann, F. (1964) 'Quantentheone der Zeitmessung , Z. Phys., 178, 551-562. Flint, H. T. (1935) 'A relativistic basis of the quantum theory', Proc. Roy. Soc. London, 150 A, 421-441. Flint, H. T. (1937) 'Ultimate measurements of space and time', Proc. Roy. Soc. London, 159 A, Flint H T. (1948) 'The quantization of space and time', Phys. Rev., 74, 209-210. Flint' H T and Richardson, O. W. (1928) 'On a minimum proper time and its applications (1) to the number of chemical elements, (2) to some uncertainty relations', Proc. Roy. Soc. London, 117 A, 637-649. ...... ^ a Fock V A. (1962) 'Criticism of an attempt to disprove the uncertainty relation between time and energy', Zh. Eksp. Teor. Fiz., 42, 1135-1139. ^ 17 m Fujiwara, I. (1970) 'Time-energy indeterminacy relationship', Prog. Theoret. Phys., 44, 1701- Fiirth R (1929) 'Uber einen Zusammenhang zwischen quantenmechanischer Unscharfe und Struktur der Elementarteilchen und eine hierauf begrundete Berechnung der Massen von Proton und Elektron', Z /Viys., 57, 429-446. Gien, T. T. (1965) 'Relativistic formulation of the lifetime matrix in the potential theory ot collision', /. Matfi. P/iys., 6, 671-676. Glaser, W. and Sitte, K. (1934) 'Elementare Unscharfe, Grenze des penodischen Systems und Massenverhaltniss von Elektron und Proton', Z. Phys., 87, 674-686. Goebel, G. J., Karplus, R. and Ruderman, M. A. (1955) 'Momentum dependence of phase shifts , Phys. Rev., 100, 240-241. Greenman, M. and Rohrlich, F. (1973) 'Is there a maximal electrostatic field strength? , Phys. 
Rev., 8 D, 1103-1109. . . . , Griffith, R. W. (1974) 'Explicit formula from field theory for the average intrinsic size of a real or virtual photon', Nuovo Omenta, 21 A, 435-470. Heisenberg, W. (1927) 'Uber den anschaulichen Inhalt der Quantentheoretischen Kinematik und Mechanik*, Z. Phys., 43, 172-198. Heisenberg, W. (1936) 'Die selbstenergie des Elektrons'. Z. Phys., 65, 4-13. Heisenberg, W. (1938a) 'Uber die in der Theorie der Elementarteilchen auftretende umverselle Lange', Ann. Phys. Lpz., 32, 20-33. «...,.■.,. Heisenberg, W. (1938b) 'Die Grenzen der Anwendbarkeit der bishengen Quantentheone , Z. Phys., 110, 251-266. _ . j m t ^ ., . , „ Heisenberg, W. (1942) 'Die "beobachtbaren Grossen" in der Theone der Elementarteilchen , Z. Phys. 120, 513-538. L ^ . . D . Heisenberg, W. and Euler, H. (1936) 'Folgerungen aus der Diracschen Theone des Positrons , Z Phys., 98, 714-732. Papp 49 Hellund, E. J. and Tanaka, K. (1954) 'Quantized space-time', Phys. Rev., 94, 192-195. Henning, H. (1956) 'Die Unscharferelation in der Dirac-Gleichungen und in der relativistischen Schrodinger-Gleichung', Z. Naturforsch., 11 A, 101-118. Hill, E. L. (1955) 'Relativistic theory of discrete momentum space and discrete space-time', Phys. Rev., 100, 1780-1783. Iwanenko, D. (1931) 'Die Beobachtbarkeit in der Diracschen Theorie', Z. Phys., 72, 621- 624. Jaffe, J. and Shapiro, 1. 1. (1972) 'Lightlike behaviour of particles in a Schwarzschild field', Phys. Rev., 6 D, 405-406. Jauch, J. M. and Marchand, J. P. (1967) 'The delay time operator for simple scattering systems', Helv. Phys. Acta, 40, 217-229. Jordan, P. and Fock, V. (1930) 'Neue Unbestimmtheitseigenschaften des elektromagnetischen Feldes', Z. Phys., 66, 206-209. Kadyshevsky, V. G. (196 1) 'On the theory of quantization of space-time', Zh. Ekspenm. Teor. Fiz., 41, 1885-1894. Kalnay, A. J. (1971) The Localization Problem, Studies in the Foundations Methodology and Philosophy of Science, Springer- Verlag, Berlin, 4, 93-100. Kalnay, A. J. 
and Toledo, B. P. (1967) 'A reinterpretation of the notion of localization', Nuovo Omenta, 48, 997-1007. Kaluza, T. (1921) 5. B. Akad. Wiss. Berlin, 966. Kilian, H. and Petzold, J. (1970) 'Zur Begriindung der Gamowschen Zerfallsthorie', Ann. Phys. Lpz., 24, 335-355. Kim, D. Y. (1973) 'A possible role of universal length in the theory of weak interactions', Can. /. Phys., 51, 1577-1581. Landau, L. and Peierls, R. (1931) 'Erweiterungdes Unbestimmtheitsprinzip fur die relativistische Quantentheone', Z. Phys., 69, 56-69. Latzin, H. (1927) 'Quantentheone und Realitat', Naturwissenschaften, 15, 161. Levi, R. (1926) 'L'atome dans la theorie de Taction universelle et discontinue', C.R. Acad. Sci. Paris, 183, 1026-1028. Lippmann, B. A. (1966) 'Operator for time delay induced by scattering', Phys. Rev., 151, 1023-1024. Lord, E. A., Sinha, K. P. and Sivaram, C. (1974) '"Cosmological" constant and scalar gravitons', Progr. Theoret. Phys., 52, 161-169. Ludwig, G. (1972) 'An improved formulation of some theorems and axioms in the axiomatic foundation of the Hilbert space structure of quantum mechanics', Commun. Math. Phys., 26, 78-86. Mandelstamm, L. and Tamm, I. (1945) /. Phys. U.S.S.R., 9, 249. March, A. (1936) 'Die Geometrie kleinster Raume', Z. Phys., 104, 93-99, 161-168. March, A. (1937a) 'Zur Grundlegung einer statistischen Metrik', Z. Phys., 105, 620-632. March, A. (1937b) 'Statistische Metrik und Quantenelektrodynamik', Z. Phys., 106, 49-69. March, A. (1937c) 'Die Frage nach der Existenz einer kleinsten Wellenlange', Z. Phys., 108, 128-136. March, A. (1941) 'Raum, Zeit und Naturgesetze', Z. Phys., 117, 413-436. Markov, M. (1940) 'On the four "dimensionally" stretched electron in a relative quantum region', Zh. Eksperim. Teor. Fiz., 10, 1311-1338. Markov, M. A. (1965) 'Can the gravitational field prove essential for the theory of elementary particles?', Suppl. Progr. Theoret. Phys. (extra number), 85-95. Markov, M. A. 
(1966) 'Elementary particles with largest possible masses (quarks and maximons)', Zh. Eksperim. Teor. Fiz., 51, 878-890. Moglich, F. and Rompe, R. (1939) 'Uber einige Folgerungen aus der Existenz eines kleinsten Zeitintervalles', Z Phys., 740-750. Motz, L. (1962) 'Gauge invariance and the structure of charged particles', Nuovo Omenta, 26, 672-697. by. Mote, L. (1972) 'Gauge invariance and the quantization of mass (of gravitational charge)', Nuovo Omenta, 12 B, 239-255. Olkhovsky, V. S. and Recami, E. (1970) 'About a space-time operator in collision description', Lett. Nuovo Omenta, 4, 1165-1173. Olkhovsky, V. S., Recami, E. and Gerasimchuk, A. J. (1974) 'Time operator in quantum mechanics (I. Nonrelativistic case)', Nuovo Omenta, 22 A, 263-278. Pa PP, E. (1972a) 'Interaction time measurement and causality', Nuovo Omenta, 10 B, 69-78. 50 Uncertainty Principle and Foundations of Quantum Mechanics Papp, E. (1972b) 'The non-relativistic limit of a dynamical proper time', Nuovo Cimento, 10 B, Papp, E. (1973) 'Peculiarities of the quantum-mechanical space-time description', Int. J. Theoret. Phys 8 429—441. Papp E (1974a) 'Field theoretical space-time quantization', Int. J. Theoret. Phys., 9, 101-115. Papp e! (1974b) 'Imprecision description of the high-energy annihilation and production proces- ses', Int. J. Theoret. Phys., 10, 123-143. , Papp, E. (1974c) 'An extended approach to the field theoretical time operators , Int. J. Ineoret. Phys., 10, 385-389. .. . , . Papp, E. (1975) 'Meaning and bounds for the space and time uncertainty contributions , Ann. Pavlopoulos "t. G. (1967) 'Breakdown of Lorentz invariance', Phys. Rev., 159, 1 106-11 10. Penrose, R. and MacCallum, M. A. H. (1973) 'Twister theory: An approach to the quantisation of fields and space-time', Phys. Rep., 6 C, 243-315. ,„„. _ .„ ,„ Peres, A. (1966) 'Causality in S-matrix theory', Ann. Phys. (N. Y.), 37, 179-208. Peres, A. and Rosen, N. 
(1966) 'Quantum limitations on the measurement of gravitational fields , Phys. Rev., 118, 335-336. ,. „,„„ s Peters, P. C. (1974) 'Propagation in a space-time lattice , Phys. Rev., 9 D, 3223-32/8. Planck, M. (1913) Vorlesungen iiber die Theorie det Warmestrahlung, Johann Ambrosius Bartn, Leipzig, 167-169. Poincare, H. (1913) Dernieres pensees, Flaramarion, Pans. „,_,,„ Pokrowski, G. I. (1928) 'Zur Frage nach der Struktur der Zeif, Z. Phys., 51, 737-739. Proca A (1928) Sur la Theorie des Quanta de Lumiere.Blanchaid, Paris. Remak, B. (1931) 'Zwei Beispiele zur Heisenberg-schen Unsicherheitsrelation bei gebundenen Teilchen', Z. Phys., 69, 332-345. . Ruark, A. E. (1928) 'The limits of accuracy in physical measurements , Proc. Nat. Acad. aci. Was'h., 14, 322-328. ., ^ ., , ™. «1 nn_->89 Schames, L. (1933) ' Atomistische Auffassung von Raum und Zeit , Z. Phys., »1, z/u-zsz. Schild, A. (1948) 'Discrete space-time and integral Lorentz transformations', Phys. Rev., 73, 414—415 Schrddinger, E. (1930) 'Uber die kraftefreie Bewegungin der relativistischen Quantenmechanik', Berliner Berichte, 418-^28. ...... t- ,^u • > Schroder, U. E. (1964) 'Lokalisierte Zustande und Teilchenbild bei relativistischen Feldtheonen , Ann. Phys. Lpz., 14, 91-112. . . Sivaram, C. and Sinha, K. P. (1974) 'Gravitational charges and the quantization of mass, Lett. Nuovo amenta, 10,227-230. Smith, F. T. (1960) 'Lifetime matrix in collision theory', Phys. Rev., 118, 349-356. Snyder, H.S. (1947) 'Quantized space-time', Phys. Rev., 71, 38-41. Takano, Y. (1961) "The singularity of propagators in field theory and the structure of space-time , Prop. Theoret. Phys., 26, 304-314. Thomson, J. J. (1925/ 1926) 'The intermittence of electric force', Proc. R. Soc. Edinb., 46, 90-1 15. Treder, H. (1963) 'Gravitonen', Fort. Phys., 11, 81-108. Treder, H. (1974) 'Gravitationskollaps und Lichtgeschwindigkeit im Gravitationsfeld', Ann. Phys. Lpz., 31, 325-334. Vialtzew, A. N. 
(1963) 'Discrete space and time' (in Russian), Isd. 'Nauka', Moskwa.
Wataghin, G. (1930) 'Über die Unbestimmtheitsrelationen der Quantentheorie', Z. Phys., 65, 285-288.
Wataghin, G. (1932) 'Zur relativistischen Quantenmechanik', Z. Phys., 73, 121-129.
Wheeler, J. A. (1957) 'On the nature of quantum geometrodynamics', Ann. Phys. (N.Y.), 2, 604-614.
Wheeler, J. A. (1962) Geometrodynamics, Academic Press, New York.
Wigner, E. P. (1955) 'Lower limit for the energy derivative of the scattering phase shift', Phys. Rev., 98, 145-147.
De Witt, B. S. (1960) 'The Quantization of Geometry', Institute of Field Physics, University of North Carolina.
Yang, C. N. (1947) 'On quantized space-time', Phys. Rev., 72, 874.

Uncertain Cosmology

CHRISTOPHER J. S. CLARKE
University of York, England

1. THE VIEWPOINT OF QUANTUM COSMOLOGY

The particularity of one's presuppositions should never be underestimated. I write from a set of assumptions which to the quantum physicist may seem peculiar: that one can and should provide a coherent mathematical scheme for the entire universe; that the scheme should admit a model that represents the universe as we perceive it to be, with ourselves as observers in it; that the occurrence of such a model within the scheme can, in some sense, explain both our own coming-into-being and also the nature of what we now observe. Most interpreters of quantum theory, being concerned only with certain delineated subsystems of the universe, can with propriety assume the prior existence of the macroscopic world of daily experience as a background given in advance, in terms of which the subsystem must be explained. By rejecting this in favour of a cosmological view,* I am forced into the contentious position of using a quantum theory which includes the observer in the formalism.
More than this: I wish to do it in a way consistent with a general relativistic† treatment of the universe, since such a treatment uniquely combines logical elegance with observational consistency. This raises three particular problems.

*Certain cosmologists (Dicke, 1961; Collins and Hawking, 1973) have used our own existence as a constraint on cosmological parameters. But this only gives a non-tautologous explanation of the universe if the physical schema used is able to generate a reasonably restricted class of cosmological models before such a constraint is imposed; it does not absolve us from developing models which are potentially independent of our own existence.

†Within the term 'general relativity' I include all theories where space-time is represented as a $C^1$ manifold on which local inertial frames are specified by a metric of Lorentz signature, determined along with other physical entities by field equations which may not necessarily be (either of) those of Einstein.

(1). If Newtonian space-time be abandoned, there is no reason, either physical or philosophical, to assume that the global properties of space-time should be the same as the local properties which our short-range observations reveal. In particular (as I have argued in detail elsewhere (Clarke, 1976)) the universe of general relativity may not admit any global time coordinate: if its existence (i.e. stable causality) is assumed then it must be recognized as an additional postulate made either in anticipation of future experimental evidence or as a temporary expedient to ease the calculations. But without the assumption of global time the traditional quantum-mechanical picture of a system evolving in time is untenable.

(2). In a general space-time there is no symmetry group which will enable one to Fourier analyse a quantum field to give it an unambiguous particle interpretation.
The definitions of particle number, particle creation rate, etc., are, in this situation, a matter of great controversy (Unruh, 1974).

(3). Since the structure of space-time is itself a dynamical variable, and not merely a fixed arena for other events, it must itself be quantized: firstly, because its source is composed of quantized fields; and, secondly, because it seems likely that there are regions of the universe where the space-time curvature is characterized by a length-scale small enough to be in the quantum domain. But the quantization of space-time is not only technically difficult. In addition, the removal of both a fixed background space-time and a reliable particle-representation leaves very little structure on which to hang an interpretation of any formalism proposed.

This third point leads one to the central difficulty of quantum cosmology: if everything is quantized (space, time, particles, observers) then everything dissolves into a structureless haze from which it is impossible to extract any semblance of concrete reality.

Such a viewpoint is intimately related to the place afforded to the uncertainty relations. For, as conceived by Heisenberg (1930) 'the statistical character of the relation (between values of dynamical quantities) depends on the fact that the influence of the measuring device is treated in a different manner than the interaction of the various parts of the system on one another ... The chain of (determinate) cause and effect could be quantitatively verified only if the whole universe were considered as a single system, but then physics has vanished, and only a mathematical scheme remains' (p. 58). The more cosmology has developed, the more this observation of Heisenberg has been confirmed, that the simple extension of quantum theory to the cosmological domain yields a mere 'mathematical scheme' that stands in need of something else before physics can emerge.
For him this addition comes through alternative descriptions which (Heisenberg, 1974) are 'complementary' to quantum theory in being compatible with it, but not deducible from it.

The course of this chapter is the pursuit of this 'something else' in a context beyond the usual laboratory one: the context of a cosmological picture containing physically extreme regions where no distinction, however arbitrary, can be made between the observer and the observed, the 'measuring device' and the 'parts of the system', in Heisenberg's terms. Here the use of a complementary description in the original restricted sense of Bohr (1928) is impossible, and the uncertainty relations will not, as in the laboratory case, appear as a limitation on the applicability of a classical corpuscular description (Heisenberg, 1930).

2. THE WHEELER-EVERETT THEORY AND ADDITIONAL STRUCTURE

First we must examine an unusual cosmological view which seems to offer the hope of dispensing with any addition to quantum theory. Having given a detailed philosophical critique elsewhere (Clarke, 1974), I shall here summarize the conclusions and develop further mathematical points.

The theory (Wheeler, 1957; Everett, 1957; de Witt and Graham, 1973) regards the universe as a single quantum mechanical system whose state vector $\Psi$ undergoes a determinate Hamiltonian evolution in a Hilbert space $\mathcal{H}$. When an observation is taking place $\mathcal{H}$ decomposes into the tensor product $\mathcal{H}_M \otimes \mathcal{H}_S$, where $\mathcal{H}_S$ describes the observed microsystem and $\mathcal{H}_M$ describes everything else. Correspondingly, $\Psi = \Psi_M \otimes \Psi_S$. Suppose that $\Psi_S = \sum_i a^i \psi_i$, where $\psi_1, \psi_2, \ldots$ are eigenstates in which the quantity being measured has a definite value.
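As a finite-dimensional illustration of the decomposition just described, the following sketch builds $\Psi = \Psi_M \otimes \Psi_S$ with NumPy and checks that it coincides with $\sum_i a^i (\Psi_M \otimes \psi_i)$. The dimensions, the amplitudes and the helper name `product_state` are assumptions made purely for illustration; only the two formulas are taken from the text.

```python
import numpy as np

def product_state(psi_m, amplitudes):
    """Return (Psi, Psi_S) with Psi_S = sum_i a^i psi_i and Psi = Psi_M (x) Psi_S.

    The eigenstates psi_i are taken to be the standard basis vectors of the
    microsystem space (an assumption made for illustration).
    """
    psi_s = np.asarray(amplitudes, dtype=complex)  # components a^i in the psi_i basis
    psi = np.kron(psi_m, psi_s)                    # tensor product Psi_M (x) Psi_S
    return psi, psi_s

# Hypothetical toy data: a 3-dimensional "everything else" and a 2-outcome microsystem.
psi_m = np.array([1.0, 0.0, 0.0], dtype=complex)
a = np.array([0.6, 0.8j])                          # |0.6|^2 + |0.8|^2 = 1

psi, psi_s = product_state(psi_m, a)

# Linearity: Psi is the same superposition of the product states Psi_M (x) psi_i.
basis = np.eye(2, dtype=complex)
expanded = sum(a[i] * np.kron(psi_m, basis[i]) for i in range(2))
```

The arrays `psi` and `expanded` agree component by component, which is all that the tensor-product notation asserts at this stage.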
The usual theory of measurement is followed, according to which the state $\Psi_M \otimes \psi_i$ evolves (approximately) into a state $\Psi_M^i \otimes \psi_i$, where $\Psi_M$ represents the state of affairs before the measurement is made while $\Psi_M^i$ represents the state after a definite value, corresponding to $\psi_i$, has been found. Thus the measurement as a whole is described by an evolution from $\Psi_M \otimes \Psi_S = \Psi_M \otimes (\sum_i a^i \psi_i)$ into $\sum_i a^i \Psi_M^i \otimes \psi_i$.

The Wheeler-Everett theory interprets this last state as the simultaneous existence of many copies of the universe, one for each index $i$. This means that at each measurement the universe splits into many branches, each branch corresponding to a separate definite value for the measurement. Every person in the universe is thus split into many copies which henceforth evolve independently of each other (by linearity). Each copy is, by construction, aware only of the (definite) outcome of the measurement in his branch so that to him it appears as if the state-vector has 'collapsed' onto an eigenstate.

All the branches are equivalent: there is no need to try to attach meaning to one branch being 'more likely' than another. The statistical interpretation of quantum theory is derived, not from such an additional postulate, but from a consideration of long sequences of experiments. From this it is shown that, in the limit as the lengths of the sequences tend to infinity, in all branches of the universe except for a set of measure zero the relative frequencies of the various possible outcomes for the experiment accord with the usual quantum mechanical probabilistic interpretation of the $a^i$.

Now, it could well be argued that this splitting provides a mental picture that is simpler and more economical of hypothesis than that provided by the 'collapse of the wave packet' description of von Neumann (1955) and others.
But in terms of physical verification the two approaches are equivalent, and both stand in need of definite criteria which determine when it is that a measurement takes place and how the Hilbert space is to be decomposed; information that is assumed to be given ab extra in both approaches to quantum theory. But in cosmology, by definition, nothing is extra to the dynamical formalism; there are no external fields, not even the geometrical structure of space-time if, as in general relativity, this geometry is itself a quantized dynamical variable. Thus I propose the following thesis: a system of quantum cosmology, as it is usually understood, cannot contain enough intrinsic structure to allow one to use criteria which might characterize the occurrence of measurement situations.

Let me clarify this by a comparison with a classical case. There the dynamical system might be represented by $N$ functions $\mathbb{R} \to \mathbb{R}^3$ (specifying the positions of the $N$ particles of the system as functions of time) determined by $N$ coupled ordinary differential equations. In interpreting such a system, it suffices to take an existing structure within the mathematics (the relative positions of the particles) and match it with a corresponding observed structure. In quantum mechanics, however, the system is more usually represented mathematically by one function $\mathbb{R} \to \mathcal{H}$ (the vector in Hilbert space as a function of time), again governed by a differential equation. In interpreting this it is not enough to match a structure intrinsic to the mathematics with something observable: the interpretation itself sets up further implicit mathematical structure by distinguishing between different vectors in the Hilbert space. Note that there is in this respect no essential difference between classical and quantum systems, only a difference in the presentation.
The classical system could be presented as one function $\mathbb{R} \to \mathbb{R}^{3N}$, while the quantum system could be presented as an infinite set of equations for the coefficients of the state in a particular basis, and both are often found.

But in quantum cosmology most of the structure beyond the specification of the Hamiltonian is ambiguous, and yet it contains virtually all the physical information. It is therefore vital to be self-conscious about the amount of structure already in the Hamiltonian, and the extent and role of the additional structure that is required. This can be illustrated in three cases.

2(a) Field Theory

Let us first suppose that the quantum system is arrived at by using a background (flat) space-time. The customary procedure would be to set up a Fock-space or equivalent representation of the free fields, and then introduce some interaction. For a free field the pair $(\mathcal{H}, H)$, where $\mathcal{H}$ is the Hilbert space and $H$ the Hamiltonian, conveys almost no information at all. The only intrinsic structure for such a pair is given by the spectral type of $H$ (Plesner, 1969), and a free-field Hamiltonian has merely a homogeneous spectral type covering the positive reals with multiplicity d (the cardinality of the integers). The system only acquires some physical content when a particular Fock-space decomposition of $\mathcal{H}$ is specified.

When it comes to interacting fields, the main problem is one of establishing what $\mathcal{H}$ is by some renormalization procedure. In the absence of any rigorous treatment of this, one can only speculate as to the structure of $(\mathcal{H}, H)$; but it would be surprising if it differed at all from the free-field case. In actual practice very little interest is paid to the nature of $\mathcal{H}$, the stress being entirely on the operator algebras derived from the fields. By virtue of their interpretation, these have a very rich implicit structure.
But in cosmology it is the states which must be related to the universe we observe, since there are no external measurements to correspond to 'observables'; and so I shall concentrate on structure as it is manifested in the set of states.

2(b) Infinite Fock Space

The foregoing must be modified for cosmological application, in view of the likelihood (Schramm and Wagoner, 1974) of the universe being spatially infinite. Then it is more realistic to allow the space of 'states' to include descriptions of infinitely many particles (and not merely unboundedly many, as in Section 2(a)). A complication then arises when Fermi or Bose-Einstein statistics are needed because the appropriate 'state'-space $\mathcal{H}^*$ is not a subspace of the infinite tensor product $\mathcal{H}^\infty = \bigotimes_i (\mathcal{K})^{(i)}$ of one-particle states but a subspace* of its algebraic dual. $\mathcal{H}^*$ is in fact not a Hilbert space; but a Hamiltonian is defined in it as the dual of an appropriately defined operator on the Hilbert space $\mathcal{H}^\infty$, and the arguments of Section 2(a) apply to this.

Because $\mathcal{H}^*$ is not a Hilbert space, its elements cannot be regarded as interpretable states. It has, however, a concept of orthogonality and hence of orthogonal projection. A (mixed) state can then be defined as a subadditive and orthogonally additive function of the projectors into [0, 1]: there are no pure states. As it does not seem to be known whether there is a representation theorem for this situation (of the type given, for instance, by Langerholc (1965)) one cannot pursue the matter much further. I shall assume that it will ultimately be possible to deal with these states in the same way as mixed states in the separable-Hilbert-space case, to which I now turn.
*Explicitly $\mathcal{H}^*$ is the annihilator of the subspace of $\mathcal{H}^\infty$ consisting of all finite combinations of vectors of the form
$$e_1^{(1)} \otimes \dots \otimes e_i^{(i)} \otimes \dots \otimes e_j^{(j)} \otimes \dots \pm e_1^{(1)} \otimes \dots \otimes e_j^{(i)} \otimes \dots \otimes e_i^{(j)} \otimes \dots$$
where $f^{(k)}$ denotes $f$ as a member of $(\mathcal{K})^{(k)}$, the $k$th copy of $\mathcal{K}$, and the + (resp. -) sign gives Fermi (resp. Bose-Einstein) statistics.

2(c) Mixed States

Suppose one has the situation in Section 2(a), except that now the triple $(\mathcal{H}, H, \Phi)$ is given, where $\Phi$ is a mixed state (a self-adjoint operator of trace class) representing the universe at some initial time $t_0$. There is now room for considerable complexity since $H$ and $\Phi$ will not in general commute; but I shall argue that interpreting quantum theory by using only this information, while not impossible, is highly unsatisfactory: the available structure cannot provide an acceptable basis for decomposing a vector into a superposition of macroscopic states, as is required by any interpretation, including that of Wheeler and Everett.

At first glance an obvious procedure presents itself. Suppose that we want to give an intrinsic characterization of the measurement process described in Wheeler-Everett terms at the start of this section, where $\mathcal{H}$ splits* into $\mathcal{H}_1 \oplus \mathcal{H}_2$ and $\Phi$ decomposes accordingly into $\Phi = (\Phi_{11} + \Phi_{12}) \circ P_1 + (\Phi_{21} + \Phi_{22}) \circ P_2$ (where $\Phi_{ij} : \mathcal{H}_i \to \mathcal{H}_j \subset \mathcal{H}$, $P_i : \mathcal{H} \to \mathcal{H}_i$). It is a characteristic of the measurement situation (Daneri, Loinger and Prosperi, 1962) that in these circumstances statistical processes ensure that $\|\Phi_{12}\| = \|\Phi_{21}\| \to 0$, while the remaining components are $H$-stable in that $[\Phi_{11}, H]$ maps (approximately) from $\mathcal{H}_1$ to $\mathcal{H}_1$ and similarly for $\mathcal{H}_2$. Thus we can try to identify $\mathcal{H}_1$ and $\mathcal{H}_2$ by diagonalizing $\Phi$ and then looking for partitions of the set of eigenvectors into subsets which each span subspaces stable under $H$.
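The diagonalization step just described can be tried out on a toy model. The sketch below is only an illustration under stated assumptions: each of the two subspaces is taken to be one-dimensional, the mixed state is a real 2x2 matrix with diagonal entries `p` and `1 - p` and a small residual off-diagonal term `eps` standing in for the off-diagonal components, and the helper name `leakage` is hypothetical. It diagonalizes the matrix with `numpy.linalg.eigh` and reports how much of each eigenvector lies outside the subspace it is meant to identify.

```python
import numpy as np

def leakage(p, eps):
    """Diagonalize Phi = [[p, eps], [eps, 1 - p]] and return the largest
    squared component of an eigenvector along the 'other' basis direction.

    0.0 means the eigenvectors reproduce the two subspaces exactly;
    0.5 means they are equal-weight superpositions of the two.
    """
    phi = np.array([[p, eps], [eps, 1.0 - p]])
    _, vecs = np.linalg.eigh(phi)        # columns are orthonormal eigenvectors
    weights = vecs ** 2                  # squared components in the original basis
    # For each eigenvector, the smaller of its two weights is the leaked part.
    return float(np.max(np.min(weights, axis=0)))

eps = 1e-6                               # hypothetical residual off-diagonal term
well_separated = leakage(0.7, eps)       # clearly unequal diagonal entries
nearly_equal = leakage(0.5, eps)         # exactly equal diagonal entries
```

Varying `p` towards 1/2 at fixed `eps` shows how strongly the outcome of the diagonalization depends on the gap between the two diagonal entries relative to `eps`.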
However, this procedure cannot separate out the subspaces $\mathcal{H}_1$ and $\mathcal{H}_2$ if the probabilities of the two corresponding experimental results are almost equal, as may happen if there is an exact symmetry between the two microscopic end states of the system being observed. In that case the nature of the diagonalization will be heavily influenced by the residual off-diagonal terms $\Phi_{12}$, which would cause the procedure to select as possible outcomes states which should in fact be regarded as unacceptable superpositions of macroscopically distinct states. The difference between the values of the various probabilities involved (the eigenvalues of $\Phi$) is not the essential criterion in distinguishing possible outcomes. That criterion is the macroscopic dissimilarity of the various possibilities, and the diagonalization of $\Phi$ is linked with it only in certain cases. In general one must recognize the existence of some fundamental structure which corresponds to this dissimilarity, and is not equivalent to any intrinsic property of $(\mathcal{H}, H, \Phi)$.

*Here $\mathcal{H}_1$ and $\mathcal{H}_2$ correspond to two different outcomes to the measurement.

3. COSMOLOGY WITH STRUCTURE

Let us suppose that as well as a Hilbert space and a Hamiltonian† we have also as additional structure a particle decomposition, in the form of a Fock-space representation of $\mathcal{H}$ (supposing, for simplicity, a finite but unbounded number of particles). This might seem very drastic, but it is hard to see how one can get away with any less.

†For the sake of definiteness, and to remain with familiar territory, I have phrased my account in terms of a conventional $(\mathcal{H}, H)$ formalism; but this is by no means essential, and is probably undesirable.
One consequence of this is that the problem of defining particles in curved space in order to study particle creation in cosmology (Parker and Fulling, 1972) is now reversed: particles are supposed given at the outset, and the problem is now to define the space in which they are situated. Instead of particle-creation one has space-annihilation.

Although particles may be part of the a priori structure, they cannot directly give the sort of information needed to define macroscopic states and measurements. The universe cannot always be split into apparatus and microsystem along particle lines, since particle number could be one of the dynamical quantities being measured. The process of defining macroscopic states must be two-stage: first, the particle structure has to define a spatial structure; then criteria for macroscopic interpretability have to be formulated in terms of spatial structure.

The basic ideas for passing from particles to space are fairly simple, and limited progress has been made in their formal articulation. Consider an $N$-particle state of the form $\mathcal{S}(\phi_1 \otimes \phi_2 \otimes \dots \otimes \phi_N)$ where $\mathcal{S}$ denotes (anti-)symmetrization. We may try to associate a 'distance' between $\phi_i$ and $\phi_j$ by examining $|\langle \phi_i | H | \phi_j \rangle|^2$; the greater this quantity the greater the probability of interaction between $\phi_i$ and $\phi_j$ and so the closer their 'distance'. The development of this idea (Penrose, 1972) has actually been in terms of models where there is no explicit Hamiltonian, but only quantities which may be thought of as scattering amplitudes. In the simplest case a geometry can be defined in terms of these quantities, which is a geometry of Euclidean directions.
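The 'distance' heuristic above can be made concrete in a small numerical sketch. Everything specific in it is an assumption made for illustration: the one-particle states are taken to be orthonormal basis vectors, so that each matrix element of the Hamiltonian between two of them is just a matrix entry; the 4x4 matrix `h` is hypothetical; and the conversion from weight to distance, the inverse square root of the weight, is only one of many monotone-decreasing choices. The text fixes no such formula, and the development it cites works with scattering amplitudes rather than an explicit Hamiltonian.

```python
import numpy as np

def interaction_weights(h):
    """Return w_ij = |<phi_i|H|phi_j>|^2 for basis states, with the diagonal zeroed."""
    w = np.abs(np.asarray(h, dtype=complex)) ** 2
    np.fill_diagonal(w, 0.0)
    return w

def heuristic_distance(w):
    """Turn weights into 'distances': larger weight -> smaller distance.

    d_ij = 1 / sqrt(w_ij) is an illustrative choice only; pairs with zero
    weight get distance +inf (no direct interaction).
    """
    with np.errstate(divide="ignore"):
        d = 1.0 / np.sqrt(w)
    np.fill_diagonal(d, 0.0)
    return d

# Hypothetical Hermitian 'Hamiltonian' coupling neighbours along a chain 0-1-2-3.
h = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 1.0, 0.1],
              [0.1, 1.0, 0.0, 1.0],
              [0.0, 0.1, 1.0, 0.0]])

w = interaction_weights(h)
d = heuristic_distance(w)
```

In this toy chain, strongly coupled neighbours come out closer than weakly coupled next-neighbours, and the uncoupled end states are assigned an infinite 'distance'.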
An important feature is that if one first sets up the amplitudes by using a conventional description, based on particle states which do not define precise directions, then the geometry which is deduced is still a normal Euclidean direction-geometry, but one that is not related in any simple way to the space with which one started. On this approach particles come first, and the 'real' space is the one which they define.

One can speculatively indicate the form which a cosmology based on this might take. An $N$-particle state, as above, could be examined to establish a rough criterion of 'nearness', or locality. Then a more detailed geometry could be defined which held locally, in the sense that Minkowski geometry holds locally in general relativity. Still working in terms of a group of nearby particles, states would certainly be regarded as macroscopically distinct if there was no isomorphism of the geometry which made a suitably smoothed-out particle-density for the two states coincide, even approximately. (More precisely, this would give a criterion for the distinguishability of two mixed states, each corresponding to specific states for a local group of $k$ of the $N$ particles, the remainder being unspecified.) Finally, transition probabilities would have to be specified between states which were macroscopically interpretable (i.e. which were not superpositions of distinguishable states). This would be done using either a Hamiltonian formalism, or combinatorial laws which arise more naturally in the twistor theory development already cited (Penrose, 1972).

Note one advantage of this emphasis on mixed states: one can pass with little modification to the infinite universe described in Section 2(b), where there are no pure states.

If the theory were to progress along lines like these, one can discern three aspects which would prove especially interesting.
3(a) Renormalization

Infinities arise both from the unboundedly large momenta that occur in loops, and from the unboundedly large number of intermediate particles that can appear in the perturbation expansion for an interaction. The first divergence, with which renormalization concerns itself, is an essential part of the usual space-time descriptions used and of the field-theory approach of adding interactions onto a basically free-field structure. The theory I am envisaging must start with a system of finite matrix elements, i.e. it must already be renormalized. This is another point in favour of twistor theory, which automatically yields finite answers to scattering problems (Penrose, 1975).

The second infinity (caused by non-convergence of perturbation theory expansions) is unlikely to arise, because particle density is automatically limited in a theory which places particles before space. As more particles appear, so more space appears, since space is simply the numerical relations between the particles.

3(b) Local and Global Aspects

In general relativity it is assumed that the local aspect of the universe to which we have immediate access is similar to all other local aspects, and that these local glimpses can be pieced together into a global model. In the quantum case there seems to be no reason why this should be so. That is to say, while the theory refers to the universe as a whole, there is no reason to suppose that there should be globally defined macroscopic states and transition probabilities. The desire for this, which haunts much of current work in quantum cosmology, stems from the mistaken attitude of viewing the universe from the outside, as we would observe an atom in a crystal. In reality we are part of the universe and the states with which the scientist is concerned are states relative to himself.
All our observations are local, at least in the sense that the domain of galaxies over which they extend is one in which the geometry departs only modestly from the Euclidean; and, on a purist view, they are very local, in that we are concerned directly only with photons and particles which arrive here on earth, and arguments as to their source are a matter of indirect inference. So we should not be disappointed if we fail to obtain a god-like view of the universe as a whole. What we can demand is that which is scientifically testable: a theory which enables us to predict and understand future observations in terms of present ones.

In practice, of course, one would make use of the global understanding of the universe which we think we already have: a mixed state in which a local group of particles only is specified can be regarded as a mixture of globally specified states, each of which can be analysed by analogy with a conventional cosmology. But since our present observations do not single out any unique cosmological model, many global models will be compatible with them; some having a global time, some being acausal and so on. All these participate in the mixed state which is defined by our observations, and define the states into which it may, probabilistically, turn.

The basic structures of the theory are global and comprehend the entire universe. But their translation into observation requires the selection of some particular viewpoint, in the form of a local group of particles small enough to enable a spatio-temporal structure to be defined. The 'additional structure' in the theory specifies the totality of possible viewpoints, one of which is ours.

3(c) Uncertainty

The probabilistic nature of the predictions which emerge could, if one wished, be ascribed to some kind of complementarity between the Hilbert space description and the macroscopic-state description.
I would see this as unnecessarily dualistic, preferring to regard the Hilbert space structure as a kind of scaffolding on which to hang and manoeuvre the macroscopic states, and to use to calculate the transition probabilities. The physics is a physics of the macroscopic states, and the relations between them are by their nature probabilistic.

This enables us to return to the uncertainty relations in a cosmological setting, when they become relations constraining an intrinsically probabilistic scheme. They entail limitations on our measuring abilities, rather than being consequences of limitations, because the probabilistic structures of which these are an instance are there constraining the physics of the universe even if nothing is happening which could conceivably be called a measurement. In short, the indeterminate physics which is uncovered in our laboratories is simply an aspect of the entire uncertain cosmology of which it is a tiny part.

REFERENCES

Bohr, N. (1928) 'The Quantum Postulate and the Recent Development of Atomic Theory', Nature, 121, 580-590.
Clarke, C. J. S. (1974) 'Quantum theory and cosmology', Phil. Sci., 41, 317-332.
Clarke, C. J. S. (1976) 'Time in general relativity', To appear in Minnesota Studies in the Philosophy of Science.
Collins, C. B. and Hawking, S. W. (1973) 'Why is the universe isotropic?', Astrophys. J., 180, 317-334.
Daneri, A., Loinger, A. and Prosperi, G. M. (1962) 'Quantum theory of measurement and ergodicity conditions', Nucl. Phys., 33, 297-319.
De Witt, B. S. and Graham, N. (Eds.) (1973) The Many-Worlds Interpretation of Quantum Mechanics, Princeton University Press, Princeton.
Dicke, R. H. (1961) 'Dirac's cosmology and Mach's principle', Nature, 192, 440-441.
Everett III, H. (1957) '"Relative state" formulation of quantum mechanics', Rev. Mod. Phys., 29, 454-462.
Heisenberg, W. (1930) The Physical Principles of the Quantum Theory (trans. Eckart, C.
and Hoyt, F. C.), University of Chicago Press, Chicago.
Heisenberg, W. (1974) 'Double dialogue', Theoria to Theory, 8, 11-34.
Langerholc, J. (1965) 'The trace formalism for quantum mechanical expectation values', J. Math. Phys., 6, 1210-1218.
Penrose, R. (1972) 'On the nature of quantum geometry', Magic without Magic: John Archibald Wheeler, Klauder, J. R. (Ed.), W. H. Freeman and Co., San Francisco.
Penrose, R. (1975) Quantum Gravity, Isham, C. J., Penrose, R. and Sciama, D. W. (Eds.), Oxford University Press, Oxford.
Parker, L. and Fulling, S. A. (1972) 'Quantized matter fields and the avoidance of singularities in general relativity', Phys. Rev. D, 7, 2357-2374.
Plesner, A. I. (1969) Spectral Theory of Linear Operators, Vol. II, Ungar, New York.
Schramm, D. N. and Wagoner, R. V. (1974) 'What can deuterium tell us?', Phys. Today, 27, (12), 40-47.
Unruh, W. G. (1974) 'Alternative Fock quantization of neutrinos in flat space-time', Proc. Roy. Soc. London A, 338, 517-525.
von Neumann, J. (1955) Mathematical Foundations of Quantum Mechanics, trans. Beyer, R. T., Princeton University Press, Princeton.
Wheeler, J. A. (1957) 'Assessment of Everett's "Relative State" Formulation of Quantum Theory', Rev. Mod. Phys., 29, 463-465.

7

Uncertainty Principle and the Problems of Joint Coordinate-Momentum Probability Density in Quantum Mechanics

V. V. KURYSHKIN
Peoples' Friendship University, Moscow, U.S.S.R.

1. INTRODUCTION

The 50-year old history of the development of quantum mechanics has been extremely rich in attempts to reconsider its interpretation, to alter its mathematical formalism and finally to create a new theory that would provide a more complete description of physical reality than the one offered by quantum mechanics.
Among the investigations conducted in this field are those devoted to the search for the singular solutions of the equations of quantum mechanics (De Broglie, 1956; De Broglie and Andrade e Silva, 1971); the search for particle-like solutions of the non-linear field theory (Finkelstein and coworkers, 1956; Glasko and coworkers, 1958; Rybakov, 1974); the attempts to introduce all kinds of 'hidden' parameters (Bohm, 1952; Pena-Auerbach and coworkers, 1972); the realization of various stochastic approaches to quantum mechanics (Fenyes, 1952; Bess, 1973); the attempts to explain quantum phenomena by the existence of an 'imaginary' or 'hidden' thermostat, 'subquantum medium' (Bohm and Vigier, 1954; Terletsky, 1960; De Broglie, 1964).

The authors of such investigations usually proceed from the assumption that generally accepted quantum mechanics does not completely describe the physical reality and that it is possible to create a more profound theory which would treat all experimentally measurable quantities as simultaneously existing physical realities.

The incompleteness of the quantum-mechanical description was implicit in the earliest works of the founders of quantum mechanics (De Broglie, 1927), and since 1935 it has been a kind of accusation against the fully established quantum mechanics (Einstein and coworkers, 1935; Schrödinger, 1935). However, the thesis of the incompleteness of the quantum-mechanical description remains unproved so far. This is because all the proofs of the incompleteness can easily be refuted on the grounds that quantum mechanics, owing to the well-known Heisenberg uncertainty principle (1), rejects the concept of the coordinate and the momentum of a system existing simultaneously as physical realities. But the statement of the completeness of the quantum-mechanical description remains an assumption that has not been proved either.
This stimulates a search for a new theory, more profound than that of quantum mechanics; the existence of such a theory has not been doubted by many outstanding physicists (Einstein, 1948; De Broglie, 1953; Schrödinger, 1955).

In the construction of the above-mentioned profound theories the principle of the correspondence between the sought-for theory and quantum mechanics plays a major part. In the opinion of most investigators this correspondence means that the new theory must explain the fundamental propositions of quantum mechanics as those of a certain statistical theory which appears when the completeness of the description of physical reality is partially sacrificed (when certain statistical averaging is undertaken). Thus, quantum mechanics lays quite definite claims to the sought-for profound theory. It is quite natural that the authors of different profound theories are anxious to satisfy these claims first and foremost.

In its turn the assumption that the sought-for profound theory exists lays certain claims on quantum mechanics itself. This circumstance is usually neglected by most investigators. Meanwhile, it is the main obstacle in the way of creating a profound theory. This can be illustrated by the following reasoning.

Quantum mechanics, in spite of its obvious and generally acknowledged statistical character, is not a theory of a consistently probabilistic nature. It does not make use of any joint probability distributions for physical quantities, for example for coordinate and momentum, and it defines no conditional probabilities. This fact leads to no contradictions within quantum mechanics since the quantum-mechanical description does not require that all physical quantities be considered as simultaneously existing realities.

Let us assume now that there exists a theory giving a more complete description of physical reality than quantum mechanics and treating all quantities as simultaneously existing physical realities.
Let this theory, with the completeness of the description of physical reality partially sacrificed (a certain statistical averaging employed), lead to a statistical theory coinciding with quantum mechanics. But by renouncing the completeness of the description and resorting to probabilities in this theory, we shall inevitably arrive at a statistical theory in which, along with the probability of values of the physical quantity $A_1$ and the probability of values of the quantity $A_2$, there will exist a joint probability of the values of quantities $A_1$ and $A_2$, and the probability of values of the physical quantity $A_2$ provided that $A_1$ has a certain definite value $A_1'$, i.e. the conditional probability. In other words, the statistical theory thus obtained, which by the tentative assumption coincides with quantum mechanics, will inevitably follow the conventional probability scheme.

Hence, if the sought-for profound theory exists, then the concepts of joint and conditional probabilities can be introduced into quantum mechanics, i.e. quantum mechanics may be reduced to a consistent probability scheme.

Attempts to introduce the concept of a joint probability density for various physical quantities, and in the first place a joint coordinate-momentum distribution (quantum distribution function, QDF), have been made repeatedly. The earliest works in this field (Wigner, 1932; Terletsky, 1937; Blokhintsev, 1940) did not aim at introducing the QDF into quantum mechanics and considered the proposed phase-space functions only as possible mixed representations of the density matrix, which later proved extremely useful in concrete quantum-mechanical problems (Klimantovitch and Silin, 1960; Imre and coworkers, 1967; Arinshtein and Guitman, 1967; Gorshenkov and coworkers, 1973). It was only in 1949 that an attempt to interpret Wigner's function as a QDF was made, apparently for the first time (Moyal, 1949).
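The difficulty with reading Wigner's function as a probability density can be seen in a small numerical sketch. The example below is not taken from the works cited; it assumes a one-dimensional harmonic oscillator in units with $\hbar = m = \omega = 1$, the standard definition $W(x,p) = \pi^{-1}\int \psi^*(x+y)\,\psi(x-y)\,e^{2ipy}\,dy$, and hypothetical helper names (`psi_ground`, `psi_excited`, `wigner`). It evaluates the function at the origin of phase space for the ground state and for the first excited state.

```python
import cmath
import math


def psi_ground(x):
    """Ground-state oscillator wave function (hbar = m = omega = 1, an assumption)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)


def psi_excited(x):
    """First excited oscillator state in the same units."""
    return math.sqrt(2.0) * math.pi ** -0.25 * x * math.exp(-0.5 * x * x)


def wigner(psi, x, p, half_width=8.0, steps=1600):
    """Trapezoidal estimate of
    W(x, p) = (1/pi) * integral of conj(psi(x + y)) * psi(x - y) * exp(2i p y) dy.
    """
    dy = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        y = -half_width + k * dy
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += (weight * complex(psi(x + y)).conjugate()
                  * psi(x - y) * cmath.exp(2j * p * y))
    return (total * dy).real / math.pi


w_ground = wigner(psi_ground, 0.0, 0.0)    # comes out positive
w_excited = wigner(psi_excited, 0.0, 0.0)  # comes out negative
print(w_ground, w_excited)
```

For these two states the values at the origin are $+1/\pi$ and $-1/\pi$ respectively, so the function for the excited state cannot be a probability density in the usual sense.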
However, Moyal's statistical interpretation of quantum mechanics did not gain much support, since the sign-variability of Wigner's function prevents it from being treated as the joint coordinate-momentum probability density.

In subsequent years a few more concrete functions that might be considered as QDF were suggested (Bopp, 1956; Margenau and Hill, 1961; Mehta, 1964; Cohen, 1966a, b; Shankara, 1967; Kuryshkin, 1968; Ruggery, 1971; Zlatev, 1974). Except for Bopp's function they all turned out to be sign-variable. Besides, investigations showed (Mehta, 1964; Cohen, 1966; Kuryshkin, 1968) that the choice of any of these functions for the role of the QDF requires a certain correspondence rule (the rule of constructing quantum operators) which does not coincide with the rule (Neumann, 1932) used in quantum mechanics. In other words, the proposed QDF should be treated as no more than phase-space representations of the density matrix (Imre and coworkers, 1967; Kuryshkin, 1969a, b; Ruggery, 1971; Gorshenkov and Kognkov, 1973).

Finally, in 1966 it was proved (Cohen, 1966a, b) that in the generally accepted quantum mechanics, whose operators satisfy Neumann's requirements (Neumann, 1932; Shewell, 1959), no QDF exists: neither a non-negative QDF nor a sign-variable one.

Thus, the concept of the joint coordinate-momentum probability density cannot be introduced into generally accepted quantum mechanics, i.e. the generally accepted quantum mechanics cannot be reduced to a consistent probability scheme. This conclusion was also formulated and discussed in a number of other works (De Broglie, 1964; Andrade e Silva and Lochak, 1969; Kuryshkin, 1974).

Thus, the generally accepted quantum mechanics compels us: (a) either to reject the assumption of the existence of a theory that can provide a more complete description of physical reality, or (b) assuming that such a theory does exist, to question the validity of the generally accepted quantum mechanics itself.
Therefore, while favouring the search for a profound theory, it is necessary in the first place to reconsider the generally accepted quantum mechanics, altering it so as to introduce into it a non-negative QDF interpreted as the joint coordinate-momentum probability density. Such alterations, naturally, must not lead to the violation of those propositions that can be experimentally checked.

Heisenberg's uncertainty principle, which is a fundamental and indispensable law of quantum theory, begins to play a very important part in this case. This is because relation (1) forbids physical system states with coordinate and momentum strictly determined, while any attempt to introduce a joint coordinate-momentum probability into the quantum theory is equivalent to an implicit assumption that a physical system can possess a definite momentum together with a quite definite coordinate.

The possibility in principle of altering quantum mechanics with a view to introducing a QDF into it was shown in the works of the author of this paper (Kuryshkin, 1971, 1972a). Such alteration was based on the fact that the problem of constructing the operators $O(A)$ of physical quantities $A$ in quantum mechanics has not been completely solved.

The generally accepted quantum mechanics makes use of operators satisfying a set of requirements called the Neumann rule (Neumann, 1932; Shewell, 1959). However, as far back as 1935 it was shown that this rule is not single-valued and that the attempts to get rid of that disadvantage lead to inner contradictions (Temple, 1935a, b). Other known correspondence rules (Born and Jordan, 1925; Dirac, 1958; Weyl, 1950; Tolman, 1938; Rivier, 1951; Yvon, 1946; Kuryshkin, 1968; Kerner and Sutsliffe, 1970) also suffer from a number of drawbacks (Shewell, 1959; Groenewold, 1935; Kuryshkin, 1969b; Cohen, 1970).
It must be noted that all known correspondence rules, generally speaking, agree only in the statement that:

$$O(q_j)\,O(p_r) - O(p_r)\,O(q_j) = i\hbar\,\delta_{jr} \tag{2}$$

Commutator (2) ultimately results in relation (1).

The works criticizing the well-known correspondence rules made it possible to formulate a number of requirements for a 'non-contradictory' rule and to construct it (Kuryshkin, 1971). The application of this rule to the construction of quantum operators has led to a theory named 'quantum mechanics with a non-negative QDF' (Kuryshkin, 1972a). To date this theory has been studied fairly fully (Kuryshkin, 1972b, c, 1973; Zaparovany, 1974; Zaparovany and coworkers, 1975).

In this paper, therefore, we will mostly concentrate on the principles of constructing theories of a quantum-mechanical type possessing a non-negative QDF, as well as on a brief analysis of certain concepts distinguishing these theories from the generally accepted quantum mechanics.

It should be stressed once again that our concern will not be to offer another interpretation of the generally accepted quantum mechanics but to construct a new statistical theory which will comprise a major part of the quantum-mechanical mathematical formalism, which will have a consistent probability character, and which can therefore be considered without contradiction as a statistical theory for the would-be more profound theories.

In order to pay maximum attention to the physical sense and not to burden our paper with a lot of mathematical formulae, we will consider one-body physical systems only [coordinate $r = (r_1, r_2, r_3)$, momentum $p = (p_1, p_2, p_3)$] and pure quantum-mechanical states represented by a vector $|\psi\rangle$. The generalization of everything that follows to many-body systems, as well as to mixtures represented by a density matrix $\rho$, presents no difficulties (Kuryshkin, 1972a, 1973; Zaparovany and Kuryshkin, 1975).

2. INITIAL POSTULATES AND THEIR COMPATIBILITY

In order to construct the most general class of statistical theories which resemble quantum mechanics in their mathematical formalism and which contain a non-negative joint coordinate-momentum distribution, treated as a phase-space probability density, let us proceed from the following assumptions.

1. Interpretation Postulate

The state of a physical system at any instant of time $t$ is completely described by a joint coordinate-momentum probability density $F(r, p, t)$; the physical quantity $A$ can be represented by a coordinate-momentum-time function $A(r, p, t)$; and the experimentally observable value $\langle A \rangle$ of the physical quantity $A$ for a system in an $F$-state is defined as:

$$\langle A \rangle = \int A(r, p, t)\,F(r, p, t)\,dr\,dp \tag{3a}$$

From the physical meaning of $F$, defined by this postulate, follow its essential properties:

$$\int F(r, p, t)\,dr\,dp = 1 \tag{3b}$$

$$F(r, p, t) \ge 0 \tag{3c}$$

This postulate is only a common statement of the classical statistical theory and, in the case of $\delta$-like distributions, of classical mechanics as well. However, considering the coordinate and the momentum as simultaneously existing physical realities, we extend this statement to physical systems possessing quantum properties. The question of the compatibility of this postulate with the general property of quantum systems expressed by Heisenberg's uncertainty principle remains open so far.

2. Mathematical Formalism Postulate

The state of a physical system at any instant of time $t$ is completely described by a normalized vector $|\psi(t)\rangle$ of some states' space $\mathcal{L}$; any function $A(r, p, t)$ of phase space and time can, by a certain linear rule, be represented by a linear operator $O(A)$ in $\mathcal{L}$; and the experimentally observable value $\langle A \rangle$ of the physical quantity $A$ for a system in a $|\psi\rangle$ state is defined as:

$$\langle A \rangle = \langle \psi(t) | O(A) | \psi(t) \rangle \tag{4a}$$

where $\langle \psi_1 | \psi_2 \rangle$ is the scalar product of the vectors $|\psi_1\rangle$ and $|\psi_2\rangle$.
The normalization requirement for the vector state, the linearity of the operators and the linearity of the correspondence rule are understood as usual:

$$\langle \psi(t) | \psi(t) \rangle = 1 \tag{4b}$$

$$O(A)\{a|\psi\rangle\} = a\,O(A)|\psi\rangle, \qquad O(A)\{|\psi_1\rangle + |\psi_2\rangle\} = O(A)|\psi_1\rangle + O(A)|\psi_2\rangle \tag{4c}$$

$$O(1) = \hat{1}, \qquad O(aA) = a\,O(A), \qquad O(A_1 + A_2) = O(A_1) + O(A_2) \tag{4d}$$

where $a$ is a numerical coefficient and $\hat{1}$ is the unit operator in $\mathcal{L}$.

The second postulate is practically a slightly paraphrased basic postulate of quantum mechanics. But in contrast to the generally accepted quantum mechanics, the distinct forms and properties of the operators [with the exception of linearities (4c) and (4d)] remain undefined here.

It is essential in the first place to prove the compatibility of the above-formulated postulates, i.e. to show that equations (3) and (4) do not contradict each other. With this purpose in view let us introduce into consideration the characteristic function

$$\tilde{F}(u, v, t) = (2\pi)^{-6} \int F(r, p, t)\,e^{-i(ur + vp)}\,dr\,dp \tag{5}$$

Expanding the exponential into a series and using relation (3a), the characteristic function can be rewritten in the form:

$$\tilde{F}(u, v, t) = (2\pi)^{-6} \sum_{n, m} \frac{(-iu_1)^{n_1} \cdots (-iv_3)^{m_3}}{n_1! \cdots m_3!}\,\langle r_1^{n_1} \cdots p_3^{m_3} \rangle \tag{6}$$

where $n = (n_1, n_2, n_3)$ and $m = (m_1, m_2, m_3)$ are integer vectors. Reconstructing now the probability density $F$ from the characteristic function $\tilde{F}$ by the inverse of the integral transformation (5) and using equations (6) and (4a), we obtain:

$$F(r, p, t) = (2\pi)^{-6} \sum_{n, m} \int du\,dv\,e^{i(ur + vp)}\,\frac{(-iu_1)^{n_1} \cdots (-iv_3)^{m_3}}{n_1! \cdots m_3!}\,\langle \psi(t) | O(r_1^{n_1} \cdots p_3^{m_3}) | \psi(t) \rangle \tag{7}$$

Taking into account the linearity properties (4c) and (4d), equation (7) can be rewritten as:

$$F(r, p, t) = \langle \psi(t) | \hat{F}(r, p, t) | \psi(t) \rangle \tag{8}$$

where $\hat{F}(r, p, t)$ is a linear operator in $\mathcal{L}$, parametrically dependent on coordinate, momentum and time. The form of the operator $\hat{F}$, or rather its relation to the operators $O(r_1^{n_1} \cdots p_3^{m_3})$, is defined by equations (7) and (8).
The dependence of $\hat{F}$ on time is caused in the general case by the fact that the mathematical formalism postulate does not rule out a time dependence of the operator $O(A)$ even if the corresponding function $A(r, p)$ does not depend on time.

Let us turn now to the physical meaning of the operator $\hat{F}$ and its properties. Substituting relation (8) into equation (3a) and comparing the result with (4a), we see that the operator $\hat{F}$ completely determines the correspondence rule:

$$O(A) = \int A(r, p, t)\,\hat{F}(r, p, t)\,dr\,dp \tag{9}$$

Integrating equation (8) over phase space and taking into account the normalizations (3b) and (4b), we obtain:

$$\int \hat{F}(r, p, t)\,dr\,dp = \hat{1} \tag{10}$$

Finally, from property (3c) and relation (8) it follows that:

$$\langle \psi | \hat{F}(r, p, t) | \psi \rangle \ge 0 \tag{11}$$

i.e. $\hat{F}$ is an operator positive definite in $\mathcal{L}$ in the sense of (11).

Thus, a theory satisfying the two initial postulates contains a linear operator $\hat{F}(r, p, t)$, positive definite in $\mathcal{L}$, parametrically dependent on coordinate, momentum and time, and normalized by condition (10). The physical meaning of the operator $\hat{F}$ is defined by relation (4a) of postulate 2 and equation (8), i.e. $\hat{F}(r, p, t)$ is the operator of the probability density of coordinate $r$ and momentum $p$ at the instant of time $t$.

One can easily determine the phase-space function $f$ corresponding to the operator $\hat{F}$ in agreement with the correspondence rule (9). Indeed, writing down the probability density operator as

$$\hat{F}(\xi, \eta, t) = O(f(\xi, \eta, r, p, t)) \tag{12}$$

where $\xi$ and $\eta$ are the parameters of the operator $\hat{F}$ and of the function $f$, from relations (7) and (8), which determine the operator $\hat{F}$, we obtain:

$$f(\xi, \eta, r, p, t) = \delta(\xi - r)\,\delta(\eta - p) \tag{13}$$

Here $\delta(x)$ is Dirac's three-dimensional $\delta$-function. Hence, the coordinate-momentum probability density operator corresponds to the phase-space $\delta$-function. This conclusion is in full agreement with the correspondence rule (9) and the physical meaning of the operator $\hat{F}$.
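The step from (8) to the correspondence rule (9) can be written out explicitly. The lines below are only a restatement of the argument in the text, using the same symbols; the interchange of the phase-space integral with the scalar product is assumed to be permissible.

```latex
\begin{align*}
\langle A \rangle
  &= \int A(r,p,t)\, F(r,p,t)\, dr\, dp
     && \text{by (3a)} \\
  &= \int A(r,p,t)\, \langle \psi(t) | \hat{F}(r,p,t) | \psi(t) \rangle \, dr\, dp
     && \text{by (8)} \\
  &= \Big\langle \psi(t) \Big| \int A(r,p,t)\, \hat{F}(r,p,t)\, dr\, dp \,\Big| \psi(t) \Big\rangle .
\end{align*}
```

Since (4a) requires $\langle A \rangle = \langle \psi(t) | O(A) | \psi(t) \rangle$ for every normalized $|\psi(t)\rangle$, the operator in the last line must be $O(A)$, which is rule (9).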
Let us consider now the problem of the dynamics which are permissible in a theory which satisfies the initial postulates.

Let $|\psi(t)\rangle$ and $|\psi(t')\rangle$ determine the states of a physical system at times $t$ and $t' \ge t$, respectively. Since, in conformity with postulate 2, both these vectors belong to the same space $\mathcal{L}$, they can always be related by the transformation:

$$|\psi(t')\rangle = \hat{S}(t', t)\,|\psi(t)\rangle, \qquad \hat{S}(t, t) = \hat{1} \tag{14}$$

where $\hat{S}(t', t)$ is a linear operator, parametrically dependent on $t'$ and $t$. Assuming in (14) that $t' = t + \delta t$, in the limiting case when $\delta t \to 0$ we obtain:

$$\frac{\partial |\psi(t)\rangle}{\partial t} = \hat{X}(t)\,|\psi(t)\rangle \tag{15}$$

where $\hat{X}(t)$ is a linear operator parametrically dependent on $t$ and related to the operator $\hat{S}$ by the equation

$$\hat{X}(t) = \left. \frac{\partial \hat{S}(t', t)}{\partial t'} \right|_{t' = t}$$

The fact that the permissible dynamic equation (15) contains only the first derivative of the vector state with respect to time is an immediate consequence of postulate 2: the vector $|\psi(t)\rangle$ completely determines the state of a system, and knowledge of it is sufficient for finding the vector state $|\psi(t')\rangle$ at any instant of time $t' > t$.

Since the operator $\hat{X}$ determines the evolution of the system in time, it must be related to certain physical quantities. And since $\hat{X}$ is linear (a consequence of postulate 2) and since any physical quantity can be represented by a coordinate-momentum-time function (the requirement of postulate 1), the operator $\hat{X}$ is related by the correspondence rule (9) to a certain phase-space function, which can in the general case be complex, i.e.:

$$\hat{X} = O(X), \qquad X(r, p, t) = Q(r, p, t) - iR(r, p, t) \tag{16}$$

Here $Q$ and $R$ are real functions.

Let us now take into account the normalization condition (4b). Since the normalized vector $|\psi(t)\rangle$ must, by equation (15), automatically result in a normalized vector $|\psi(t')\rangle$, we have:

$$\frac{\partial \langle \psi(t) | \psi(t) \rangle}{\partial t} = \langle \psi | \hat{X} | \psi \rangle + \langle \psi | \hat{X}^{+} | \psi \rangle = 0$$

Since the initial state $|\psi(t)\rangle$ can be arbitrarily chosen, it follows that

$$O(X) = -O^{+}(X) \tag{17}$$

where $O^{+}(X)$ is the operator in $\mathcal{L}$ conjugate to the operator $O(X)$.
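Condition (17) can be illustrated with a small numerical sketch. The example is not part of the original argument: it assumes a two-dimensional states' space, a time-independent generator $\hat{X} = \hat{Q} - i\hat{R}$ with Hermitian $\hat{Q}$ and $\hat{R}$ chosen arbitrarily, and hypothetical helper names (`mat_mul`, `mat_exp`, `evolved_norm_sq`). It compares the norm of the evolved vector when $\hat{Q} = 0$, so that (17) holds, with the norm when $\hat{Q} \ne 0$.

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_exp(x, terms=60):
    """Taylor series for exp(x) of a 2x2 complex matrix."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = [[entry / n for entry in row] for row in mat_mul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def evolved_norm_sq(q_op, r_op, psi0, t=1.0):
    """Squared norm of S(t) psi0 with S(t) = exp((Q - iR) t), cf. (14)-(16)."""
    x = [[(q_op[i][j] - 1j * r_op[i][j]) * t for j in range(2)] for i in range(2)]
    s = mat_exp(x)
    psi_t = [sum(s[i][j] * psi0[j] for j in range(2)) for i in range(2)]
    return sum(abs(c) ** 2 for c in psi_t)


R_OP = [[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -0.3]]   # Hermitian, illustrative values
Q_ZERO = [[0.0, 0.0], [0.0, 0.0]]                # X = -iR satisfies (17)
Q_BAD = [[0.1, 0.0], [0.0, 0.1]]                 # Hermitian part violates (17)
PSI0 = [0.6, 0.8j]                               # normalized initial vector

norm_ok = evolved_norm_sq(Q_ZERO, R_OP, PSI0)
norm_bad = evolved_norm_sq(Q_BAD, R_OP, PSI0)
print(norm_ok, norm_bad)
```

With $\hat{Q} = 0$ the norm stays equal to one. With the multiple of the unit operator used for `Q_BAD`, the squared norm grows as $e^{0.2t}$, so normalization (4b) is lost.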
Since the correspondence rule (9) is linear and, owing to the properties of the operator $\hat{F}$, gives self-conjugate operators for real functions, by substituting (16) into (17) we obtain:

$$O(Q) = 0, \qquad -iO(R) = O(X) \tag{18}$$

The dynamic equation (15) will in this case take the form:

$$i\,\frac{\partial |\psi(t)\rangle}{\partial t} = O(R)\,|\psi(t)\rangle \tag{19}$$

where $R(r, p, t)$ is a real function.

Hence, a theory satisfying the two initial postulates contains a real coordinate-momentum-time function $R(r, p, t)$ (a dynamic function) which, with the help of the corresponding operator $O(R)$ and equation (19), defines the evolution of a physical system in time.

By differentiating (8) with respect to $t$ and using equation (19) it is possible in principle to obtain an equation for the probability density $F(r, p, t)$ as well. It will also contain only the first time derivative, which agrees with postulate 1. But in order to determine the distinct form of this equation one must be able to reconstruct the function $R(r, p, t)$ from the operator $O(R)$, i.e. to know the distinct form of the coordinate-momentum probability density operator $\hat{F}$.

3. GENERAL PRINCIPLES OF CONSTRUCTING QUANTUM MECHANIC-LIKE STATISTICAL THEORIES WITH CONSISTENT PROBABILITY INTERPRETATION

The results (9)-(19) of the investigation of the compatibility of the two postulates made in the previous section make it possible to formulate the general principle of constructing the theories in question.

In order to construct a quantum mechanic-like statistical theory with a consistent probability interpretation it is necessary and sufficient to use the following procedure:

(1). To represent the physical quantities as coordinate-momentum-time functions $A(r, p, t)$. (20)

(2). To choose a space $\mathcal{L}$ of the physical system's vector states $|\psi\rangle$. (21)

(3). To indicate a linear probability density operator $\hat{F}(r, p, t)$, positive definite in $\mathcal{L}$, parametrically dependent on coordinate, momentum and time, and normalized by the condition (22)

$$\int \hat{F}(r, p, t)\,dr\,dp = \hat{1} \tag{22a}$$

(4). To indicate a real dynamic function $R(r, p, t)$ of phase space and time, responsible for the evolution of the system. (23)

The necessity of some solution of the above-listed problems was shown in the previous section. Its sufficiency can easily be demonstrated in the following way.

Assume that problems (20)-(23) are solved in a certain way. Then, bringing into correspondence with any function $A(r, p, t)$ the operator, linear in $\mathcal{L}$,

$$O(A) = \int A(r, p, t)\,\hat{F}(r, p, t)\,dr\,dp \tag{24}$$

and representing the physical state of the system by a normalized vector $|\psi(t)\rangle \in \mathcal{L}$ satisfying the equation

$$i\,\frac{\partial |\psi(t)\rangle}{\partial t} = O(R)\,|\psi(t)\rangle \tag{25}$$

let us determine the value $\langle A \rangle$ of the physical quantity $A$ in the state $|\psi(t)\rangle$ as the scalar product

$$\langle A \rangle = \langle \psi | O(A) | \psi \rangle \tag{26}$$

Relations (24)-(26) now represent a quite definite closed theory from the point of view both of statistics and of dynamics. The correspondence rule (24) and the operators $O(A)$ of such a theory possess the linearity properties (4c) and (4d). A mere substitution of the operators (24) into equation (26) reproduces the definition (3a) of the values of physical quantities; it defines the function $F(r, p, t)$ in accordance with relation (8), together with its properties (3b)-(3c); and it therefore fixes the only interpretation possible in such a theory, i.e. $F(r, p, t)$ is the coordinate-momentum probability density. Finally, knowing the coordinate-momentum probability density $F(r, p, t)$ and the function representations $A(r, p, t)$ of physical quantities in this theory, it is possible to calculate joint and conditional probabilities for any physical quantities by the conventional methods of probability theory.
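The way relations (24) and (26) reproduce (3a)-(3c) can be checked on a deliberately simplified discrete analogue. This toy model is an assumption made purely for illustration and does not appear in the works cited: the continuous phase space is replaced by three points, the states' space by real two-component vectors, and the operator $\hat{F}(r, p, t)$ by three positive operators `F_OPS` that sum to the unit operator, the counterpart of (22a). All names are hypothetical.

```python
import math

# Three directions in a real two-dimensional states' space (illustrative choice).
ANGLES = [0.0, math.pi / 3.0, 2.0 * math.pi / 3.0]


def projector_scaled(theta, scale):
    """scale * |phi><phi| for the unit vector phi = (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[scale * c * c, scale * c * s], [scale * c * s, scale * s * s]]


# Discrete stand-in for the probability density operator: positive operators
# that sum to the unit operator, cf. (22a).
F_OPS = [projector_scaled(theta, 2.0 / 3.0) for theta in ANGLES]


def operator_for(values):
    """Discrete analogue of rule (24): O(A) = sum_k A_k F_k."""
    return [[sum(a * f[i][j] for a, f in zip(values, F_OPS)) for j in range(2)]
            for i in range(2)]


def expectation(op, psi):
    """Discrete analogue of (26) for a real state vector psi."""
    return sum(psi[i] * op[i][j] * psi[j] for i in range(2) for j in range(2))


psi = [math.cos(0.4), math.sin(0.4)]           # normalized state
values = [1.0, -2.0, 0.5]                      # values A_k of a quantity A
probabilities = [expectation(f, psi) for f in F_OPS]   # analogue of (8)
mean_from_operator = expectation(operator_for(values), psi)
mean_from_density = sum(a * w for a, w in zip(values, probabilities))
unit_check = operator_for([1.0, 1.0, 1.0])     # should be the unit operator
print(probabilities, mean_from_operator, mean_from_density)
```

In this sketch the numbers `probabilities` are non-negative and sum to one, and the two ways of computing the mean agree, which mirrors the statement that substituting (24) into (26) leads back to (3a).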
The theory so obtained is consequently of a consistent probability character.

It is quite natural, then, that the properties of the statistical theory thus obtained, and above all its results, will depend on the concrete solution of problems (20)-(23). The natural questions arising from this are: What is the difference between these theories and the generally accepted quantum mechanics? Can such a theory, with some concretization of problems (20)-(23), describe physical reality? And if so, in what way can this concretization be found? In the sections that follow we shall try to discuss these questions, omitting for brevity's sake all mathematical calculation.

4. MAIN CONSEQUENCES OF THE STATISTICAL THEORY UNDER CONSIDERATION

The main advantage of the statistical theory whose mathematical apparatus was given in the previous section is that it is of a consistent probability character and at the same time does not violate the basic postulate of the quantum theory. The concrete form of such a statistical theory, its properties and its results depend on the a priori solutions of problems (20)-(23). However, irrespective of this concretization a number of general theoretical consequences can be pointed out, amongst which are the following:

(1). No concretization of problems (20)-(23) makes this statistical theory coincide with the generally accepted quantum mechanics. This is quite obvious, since the generally accepted quantum mechanics would otherwise be reduced to the consistent probability scheme.

(2). In the statistical theory under consideration Neumann's requirement is in the general case violated, i.e.

$$O(f(A)) \ne f(O(A))$$

The operator of the square of a physical quantity is not equal to that quantity's operator squared, which can be written as:

$$O(A^2) = O^2(A) + \mathcal{D}(A) \tag{27}$$

The linear operator $\mathcal{D}(A)$ defined by relation (27) depends on the concrete form of the probability density operator $\hat{F}$ and, in the general case, is not equal to zero for all physical quantities.

(3). In the theory proposed here, as in any statistical theory, the value $\langle A \rangle$ of a physical quantity in the state $|\psi\rangle$ is characterized by the uncertainty (dispersion)

$$\langle (\Delta A)^2 \rangle = \langle A^2 \rangle - (\langle A \rangle)^2 = \langle (A - \langle A \rangle)^2 \rangle = \langle (O(A) - \langle A \rangle)^2 \rangle + \langle \mathcal{D}(A) \rangle \ge 0 \tag{28}$$

whose non-negativeness is guaranteed here by the consistent probability character of the theory. In contrast to the generally accepted quantum mechanics, however, the value $\langle A \rangle$ of the quantity $A$ in the states given by an eigenvector of $O(A)$ is in the general case not strictly determined. Thus, if

$$O(A)\,|\psi_a\rangle = a\,|\psi_a\rangle \tag{29}$$

where $a$ is the eigenvalue, coinciding with $\langle A \rangle$, then from (28) it follows that

$$\langle (\Delta A)^2 \rangle_{\psi_a} = \langle \psi_a | \mathcal{D}(A) | \psi_a \rangle \ge 0 \tag{30}$$

Hence, over the eigenvectors of the operator $O(A)$ the operator $\mathcal{D}(A)$ is non-negative and has the sense of a dispersion operator.

(4). If in a certain state $|\psi\rangle$ the dispersion of the quantity $A$ reaches its minimum value in the sense that

$$\langle (\Delta A)^2 \rangle_{\psi + \delta\psi} \ge \langle (\Delta A)^2 \rangle_{\psi}$$

where $|\delta\psi\rangle$ is an arbitrary infinitesimal deviation of the vector state, then $|\psi\rangle$ satisfies the non-linear equation

$$\{O(A^2) - 2a\,O(A) + a^2\}\,|\psi\rangle = d^2\,|\psi\rangle \tag{31}$$

where $a = \langle \psi | O(A) | \psi \rangle$.

(5). In the statistical theory under consideration the precision of determining the value of even a single physical quantity is limited by the inequality

$$\langle (\Delta A)^2 \rangle \ge (\delta A)^2 = \min_{(n)} \{d_n^2\} \tag{32}$$

where $d_n^2$ are the eigenvalues of equation (31). The uncertainties $\delta A$ are ultimately determined by the probability density operator and may change with time.

(6). For the uncertainties of two physical quantities $A$ and $B$ in any state $|\psi\rangle$ there exists the relation

$$\{\langle (\Delta A)^2 \rangle - \langle \mathcal{D}(A) \rangle\} \cdot \{\langle (\Delta B)^2 \rangle - \langle \mathcal{D}(B) \rangle\} \ge \tfrac{1}{4}\,|\langle \hat{C} \rangle|^2 \tag{33}$$

where $\hat{C} = [O(A), O(B)]_{-}$. Inequality (33) represents the uncertainty principle in the proposed statistical theory.

(7). In the case when

$$[O(A^2), O(A)]_{-} = 0 \tag{34}$$

is valid for a quantity $A$, the eigenvectors of equations (29) and (31) coincide. Therefore, provided that equality (34) is fulfilled, the states with the most precise values of a quantity $A$ (minimum dispersion) are defined, as in the generally accepted quantum mechanics, by the eigenvalue equation of the operator $O(A)$. If the commutation condition (34) is fulfilled both for the quantity $A$ and for the quantity $B$, the uncertainty relation (33) takes the form:

$$\langle (\Delta A)^2 \rangle \langle (\Delta B)^2 \rangle \ge \tfrac{1}{4}\,|\langle \hat{C} \rangle|^2 + (\delta A)^2 (\delta B)^2 \tag{35}$$

(8). In a similar way to the generally accepted quantum mechanics, all probability characteristics of a physical system in the theory investigated are determined by the state $|\psi(t)\rangle$, the probability density of any physical quantity in the state $|\psi(t)\rangle$ being given by the expression

$$W(A, t) = \langle \psi(t) | \int \delta(A - A(r, p, t))\,\hat{F}(r, p, t)\,dr\,dp\,| \psi(t) \rangle \tag{36}$$

Here, however, the vector $|\psi\rangle$ does not generally have a distinct physical sense and can be considered only as a mathematical image of the probability density $F(r, p, t)$, carrying all its probability information.

(9). The condition for conserving the value $\langle A \rangle$ of a physical quantity $A$ in time (the conservation law for the quantity $A$) is formally the same as in the generally accepted quantum mechanics:

$$\frac{d\langle A \rangle}{dt} = 0, \quad \text{if} \quad \frac{\partial O(A)}{\partial t} = i\,[O(A), O(R)]_{-} \tag{37}$$

However, the fulfilment of this condition with $R$ and $A$ fixed essentially depends on the distinct form of the operator $\hat{F}(r, p, t)$.

(10). The proposed theory in the general case results in concepts that have no analogue in the generally accepted quantum mechanics. They will be further named 'subquantum' concepts.
Among them one could name the 'subquantum' uncertainty $\delta A$ of a quantity $A$, which limits the precision of determining the value $\langle A \rangle$ of this quantity.

5. CONCRETIZATION OF PHYSICAL QUANTITIES AS PHASE-SPACE FUNCTIONS

According to the principle of constructing the statistical theories in question, which was formulated in Section 3, it is necessary above all to solve problem (20), i.e. to define physical quantities as certain coordinate-momentum-time functions. There obviously exist only two methods of such definition, which divide the multitude of the theories under investigation into two classes: (a) all $A(r, p, t)$ coincide with the classical ones, and (b) all $A(r, p, t)$, or at least some of them, differ from the classical ones.

The second method involves considerable difficulties (Tyapkin, 1968; Zaparovany and Kuryshkin, 1975), since a constructive approach to the choice of such functions, with the exception of Tyapkin's condition [the $A(r, p, t)$ turn into the classical ones as $\hbar \to 0$], has not been found as yet.

We shall assume, therefore, at least in this paper, that the functional dependence of all physical quantities $A$ on $r$, $p$ and $t$ is given by the functions $A(r, p, t)$ representing the same quantities in the classical theory.

6. CONCRETIZATION OF THE STATES' SPACE

Since we made use of the analogy with the classical theory in solving problem (20), it would seem quite natural to use the analogy with the generally accepted quantum mechanics when choosing the states' space.

Restricting ourselves (in this paper) to the consideration of non-relativistic theories alone, let us define $\mathcal{L}$ as the space of scalar square-integrable functions of coordinates, i.e.

$$|\psi(t)\rangle = \psi(r, t), \qquad \langle \psi(t)| = \psi^{*}(r, t) \tag{38a}$$

$$\langle \psi_1(t) | \psi_2(t) \rangle = \int \psi_1^{*}(r, t)\,\psi_2(r, t)\,dr \tag{38b}$$

For the sake of convenience in further investigations let us represent each operator $O(A)$ by a generation function $A_G(r, p, t)$, related to $O(A)$ by the transformations:

$$A_G(r, p, t) = e^{-(i/a)rp}\,O(A)\,e^{(i/a)rp} \tag{39a}$$

$$O(A)\,U(r, t) = (2\pi a)^{-3} \int A_G(r, p, t)\,e^{(i/a)(r - r')p}\,U(r', t)\,dr'\,dp \tag{39b}$$

where $a$ is a constant. Equality (39a) defines the generation function of the operator $O(A)$, while (39b) reconstructs the operator when the generation function $A_G(r, p, t)$ is known.

Relations (39) also bring into correspondence with the probability density operator $\hat{F}(\xi, \eta, t)$ a generation function $f_G(\xi, \eta, r, p, t)$, where $\xi$ and $\eta$ are the parameters of the probability density operator and of its generation function. Then, from the correspondence rule (24), there follows a connection between the generation functions of the operators $O(A)$ and the generation function of the probability density operator $\hat{F}$:

$$A_G(r, p, t) = \int A(\xi, \eta, t)\,f_G(\xi, \eta, r, p, t)\,d\xi\,d\eta \tag{40}$$

It can be shown that, owing to the positiveness of the probability density operator and its normalization (22a), the generation function $f_G$ can always be written as

$$f_G(\xi, \eta, r, p, t) = \sum_{\kappa} e^{-(i/a)rp}\,\mu_{\kappa}(\xi, \eta, r, t) \int e^{(i/a)r'p}\,\mu_{\kappa}^{*}(\xi, \eta, r', t)\,dr' \tag{41}$$

where $\mu_{\kappa}(r, p, \xi, t)$ is a certain set of functions of the phase space $(r, p)$, an additional configuration space $\xi$ and time $t$, satisfying the normalization:

$$\sum_{\kappa} \int \mu_{\kappa}(r, p, \xi, t)\,\mu_{\kappa}^{*}(r, p, \xi', t)\,dr\,dp = \delta(\xi - \xi') \tag{42}$$

7. THE CONCEPT OF THE 'SUBQUANTUM SITUATION'

Accepting the above concretizations of the functions $A(r, p, t)$ and the states' space $\mathcal{L}$, the whole of the statistical part of the theory under investigation is defined by a set of functions $\mu_{\kappa}(r, p, \xi, t)$ satisfying normalization (42). The set of functions $\mu_{\kappa}$ determines the generation function $f_G$ of the probability density operator [see expression (41)] and the operator $\hat{F}$ itself [see (39b)].
In point of fact the operator $\hat{F}$ itself can be dispensed with since, with the set of functions $\mu_{\kappa}$ fixed, the operators of all physical quantities are uniquely determined by relations (39b), (40) and (41).

It should be noted that the functions $\mu_{\kappa}$ themselves have no analogue either in classical or in the generally accepted quantum mechanics, i.e. in accordance with the terminology accepted above they are 'subquantum' notions. The values of all 'subquantum' quantities appearing in the theory ['subquantum' uncertainties (32), for instance] are determined by the set of functions $\mu_{\kappa}$. It is therefore suggested that one should say that the set of functions $\mu_{\kappa}$ represents a certain 'subquantum situation' in the theory under investigation.

Thus, in the proposed statistical theory the same physical system can be considered in various 'subquantum situations' (various sets of $\mu_{\kappa}$) and, vice versa, different physical systems can be considered in one and the same 'subquantum situation' (a fixed set of $\mu_{\kappa}$).

The choice of a 'subquantum situation' (a certain set of functions $\mu_{\kappa}$) gives a single-valued definition of all operators and consequently determines the results of the theory. A change of the 'subquantum situation' leads to a change of the whole set of results. Therefore, assuming the correctness of the statistical theory investigated here, we are compelled to acknowledge that a 'subquantum situation' reflects a certain physical reality which has no analogue either in classical or in the generally accepted quantum theory.

In the general case a 'subquantum situation', as a physical reality, can change both in time and in space. For instance, together with the unconditional 'subquantum' uncertainty of the coordinate $\delta r$, which in the general case depends on $t$, one can consider a conditional 'subquantum' uncertainty of the coordinate

$$\delta r(r_0, t) = \sqrt{\min_{\psi_{r_0}} \{\langle (\Delta r)^2 \rangle_{\psi_{r_0}}\}} \tag{43}$$

where $\psi_{r_0}$ are all possible states with $\langle r \rangle = r_0$.
The coordinate's 'subquantum' uncertainty (43) is also determined by the 'subquantum situation', but it may depend not only on time but also on the system's location $r_0$ in space. This means that there exists a possibility of space heterogeneity and anisotropy of the 'subquantum situation'.

It should be noted, however, that the 'subquantum situation' in the statistical theories under investigation is given by a set of functions $\mu_{\kappa}$ only with the above-accepted concretizations of the functions $A(r, p, t)$ and the states' space $\mathcal{L}$. In the general case the 'subquantum situation' is given by the solution of the whole set of problems (20)-(23). It is essential that the concept of a 'subquantum situation' is an indispensable part of the quantum mechanics-like statistical theories possessing a joint coordinate-momentum probability density.

8. THE SIMPLEST CONCRETIZATION OF THE PROBABILITY DENSITY OPERATOR

Since in the present paper our task is only to make a brief analysis of the possibilities of the statistical theories obtained, we will henceforward make use of the simplest 'subquantum situation', given by the set of functions

$$\mu_{\kappa}(r, p, \xi, t) = (2\pi a)^{-3/2}\,\varphi_{\kappa}(r - \xi, t)\,e^{(i/a)p\xi} \tag{44}$$

where $\varphi_{\kappa}(r, t)$ is an arbitrary set of square-integrable functions satisfying the normalization

$$\sum_{\kappa} \int |\varphi_{\kappa}(r, t)|^2\,dr = 1 \tag{45}$$

A mere substitution of the functions (44) into the integral (42), with equality (45) taken into account, shows that the set of functions (44) possesses the required normalization.

Now the coordinate-momentum probability density operator is determined up to an arbitrary set of coordinate-time functions $\varphi_{\kappa}(r, t)$ normalized by condition (45).
The concretization (44) of the probability density operator is all the more significant because here the commutator of the operators $O(r_i)$ and $O(p_j)$ does not depend on the distinct form or the number of the functions $\varphi_{\kappa}$:

$$[O(r_i), O(p_j)]_{-} = ia\,\delta_{ij} \tag{46}$$

The commutator (46) follows from relations (39b), (40), (41), (44) and (45).

Besides, the 'subquantum situation' (44) can, in a particular case, be stationary, space-homogeneous and isotropic. For this it is enough to choose:

$$\varphi_{\kappa}(r, t) = \varphi_{\kappa}(|r|) \tag{47}$$

The set of 'subquantum' functions (47), owing to relations (39b), (40), (41) and (44), results in a theory which is invariant with respect to time shifts, translations and rotations of the space.

9. CONCRETIZATION OF THE DYNAMIC FUNCTION

In the previous sections the principle of correspondence of the statistical theory investigated here with the classical and the generally accepted quantum mechanics was used for the concretization of the functions $A(r, p, t)$ and the states' space $\mathcal{L}$. Since the 'subquantum situation' has no analogue in the indicated theories, the probability density operator $\hat{F}$ so far remains determined only up to a normalized set of functions $\varphi_{\kappa}(|r|)$ and the quantity $a$ present in relations (39b), (41) and (44). However, in spite of this indeterminacy of the theory, the problem (23) of the choice of a dynamic function $R(r, p, t)$ is definitely solved by the correspondence principle.

With the accepted concretizations of $A(r, p, t)$ and $\mathcal{L}$, for the coordinate and momentum uncertainties in any state $\psi$ we have:

$$\langle (\Delta r_i)^2 \rangle \ge (\delta r)^2, \qquad \langle (\Delta p_j)^2 \rangle \ge (\delta p)^2 \tag{48a}$$

$$\langle (\Delta r_i)^2 \rangle \langle (\Delta p_j)^2 \rangle \ge \frac{a^2}{4}\,\delta_{ij} + (\delta r)^2 (\delta p)^2 \tag{48b}$$

where the 'subquantum' uncertainties $\delta r$ and $\delta p$ are functionals of the set of functions $\varphi_{\kappa}$ (Kuryshkin, 1972a, 1972b, 1973).

Relations (48) determine the conditions of the transition of the statistical theory under investigation into the classical theory.
Since the classical theory allows $F$-distributions in which the coordinate and the momentum are precisely determined, the conditions for such a transition are

$$\delta r \to 0, \qquad \delta p \to 0, \qquad \alpha \to 0 \tag{49}$$

Differentiating the probability density $F(r,p,t)$ (8) with respect to time, using the evolution equation (19) and then taking the limit (49), we come to the following conclusion (Kuryshkin, 1972b): the statistical theory under consideration satisfies the correspondence principle when, and only when,

$$\alpha R(r,p,t) = H(r,p,t) \tag{50}$$

where $H(r,p,t)$ is the system's Hamiltonian.

Relations (48) also determine the conditions for the possible transition of the statistical theory under investigation into generally accepted quantum mechanics. These conditions obviously are

$$\delta r \to 0, \qquad \delta p \to 0 \tag{51}$$

Comparing now commutator (46), relation (48b) and equation (25), under conditions (50) and (51), with commutator (2) used in generally accepted quantum mechanics, with Heisenberg's uncertainty principle and with the Schrödinger equation, we come to the conclusion

$$\alpha = \hbar \tag{52}$$

Thus, the correspondence of the statistical theory under consideration with generally accepted quantum mechanics requires that the quantity $\alpha$, which is present in relations (39b), (41), (44) and (50), coincide with Planck's constant.

10. A PARTICULAR CASE OF THE THEORY AND SOME OF ITS APPLICATIONS

The concretization of the functions $A(r,p,t)$, the states' space $\mathcal{L}$, the probability density operator $F(r,p,t)$ and the dynamic function $R(r,p,t)$ introduced in the previous sections results in a particular case of the quantum mechanics-like statistical theory with a consistent probability interpretation.
The 'subquantum situation' in this theory is given by a set of functions $\varphi_\kappa(|r|)$ normalized by the condition

$$\sum_\kappa \int |\varphi_\kappa(|r|)|^2\,dr = 1 \tag{53}$$

The operators of physical quantities are defined by the correspondence rule

$$O(A)\,U(r,t) = (2\pi\hbar)^{-3}\int \varphi(r-\xi,\,p-\eta)\,A(\xi,\eta,t)\,e^{(i/\hbar)(r-r')p}\,U(r',t)\,d\xi\,d\eta\,dr'\,dp \tag{54}$$

where $U(r,t)$ is an arbitrary coordinate-time function, $A(r,p,t)$ is the phase-space and time function corresponding to the quantity $A$ in the classical theory, $\hbar$ is Planck's constant, and $\varphi(r,p)$ is an auxiliary function related to the 'subquantum' functions $\varphi_\kappa(|r|)$ by the relations

$$\varphi(r,p) = (2\pi\hbar)^{-3/2}\,e^{-(i/\hbar)rp}\sum_\kappa \varphi_\kappa(|r|)\,\tilde{\varphi}_\kappa^*(|p|) \tag{55}$$

$$\tilde{\varphi}_\kappa(|p|) = (2\pi\hbar)^{-3/2}\int e^{-(i/\hbar)rp}\,\varphi_\kappa(|r|)\,dr \tag{56}$$

The physical system's state in such a theory is described by the vector (wave function) $\psi(r,t)$, normalized by the condition

$$\int |\psi(r,t)|^2\,dr = 1 \tag{57}$$

and satisfying an equation of the same type as the Schrödinger equation

$$i\hbar\,\frac{\partial\psi(r,t)}{\partial t} = O(H)\,\psi(r,t) \tag{58}$$

where $H(r,p,t)$ is the system's Hamiltonian function and $O(H)$ is the operator corresponding to it in accordance with rule (54). The mean value $\langle A\rangle$ of the physical quantity $A$ in the state $\psi$ is determined by the formula

$$\langle A\rangle = \int \psi^*(r,t)\,O(A)\,\psi(r,t)\,dr \tag{59}$$

The mathematical formalism of this particular case of the theory, given by formulae (53)-(59), follows immediately from relations (24)-(26) with the accepted concretization of the function $A(r,p,t)$ and with equalities (38), (39a)-(41), (44)-(45), (47), (50) and (52) taken into account.

Substituting the operators (54) into formula (59) and using the auxiliary functions (55), (56) and the normalizations (53), (57) results in

$$\langle A\rangle = \int A(r,p,t)\,F(r,p,t)\,dr\,dp \tag{60a}$$

$$F(r,p,t) = (2\pi\hbar)^{-3}\sum_\kappa \left|\int \varphi_\kappa^*(|r-\xi|)\,e^{-(i/\hbar)\xi p}\,\psi(\xi,t)\,d\xi\right|^2 \tag{60b}$$

$$\int F(r,p,t)\,dr\,dp = 1 \tag{60c}$$

Relations (60) determine the consistent probability interpretation of the theory. The equation for $F(r,p,t)$ can be obtained from equation (58) with the help of relations (54), (55) and (60b) (Kuryshkin, 1972c).
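As an illustration of how the density (60b) behaves, note that integrating it over $p$ leaves a coordinate density equal to the convolution of $|\psi|^2$ with $\sum_\kappa|\varphi_\kappa|^2$. The following is a minimal one-dimensional sketch of that smearing, not the author's calculation: it assumes a single 'subquantum' function, Gaussian shapes, and arbitrary illustrative widths `SIGMA_PSI` and `DELTA_R` (all hypothetical). Under these assumptions the smeared density stays normalized and its variance is the sum of the two variances, so it can never fall below $(\delta r)^2$.

```python
import math


def gaussian_density(x, sigma):
    """Normalized 1-D Gaussian density with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


def moments(xs, density, dx):
    """Return (norm, mean, variance) of a density sampled on a uniform grid."""
    norm = sum(density) * dx
    mean = sum(x * w for x, w in zip(xs, density)) * dx / norm
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, density)) * dx / norm
    return norm, mean, var


def smeared_density(psi2, phi2, dx):
    """Discrete convolution W(x_i) = sum_j |psi(x_j)|^2 |phi(x_i - x_j)|^2 dx."""
    n = len(psi2)
    half = n // 2  # grid index of x = 0 on the symmetric grid
    out = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            k = i - j + half  # grid index of the point x_i - x_j
            if 0 <= k < n:
                total += psi2[j] * phi2[k]
        out.append(total * dx)
    return out


DX = 0.05
XS = [i * DX for i in range(-300, 301)]  # symmetric grid on [-15, 15]
SIGMA_PSI = 1.0  # hypothetical width of |psi|^2
DELTA_R = 0.5    # hypothetical 'subquantum' coordinate uncertainty

PSI2 = [gaussian_density(x, SIGMA_PSI) for x in XS]
PHI2 = [gaussian_density(x, DELTA_R) for x in XS]
W = smeared_density(PSI2, PHI2, DX)

NORM_W, MEAN_W, VAR_W = moments(XS, W, DX)
_, _, VAR_PSI = moments(XS, PSI2, DX)
print("norm of W:", NORM_W)
print("variance of |psi|^2:", VAR_PSI, " variance of W:", VAR_W)
```

The Gaussian choice is only for convenience; the additivity of variances under convolution does not depend on it.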
As has been noted above, the generally accepted interpretation of the wave function $\psi$ cannot be accepted here since

$$W(r,t) = \int F(r,p,t)\,dp = \int |\psi(\xi,t)|^2 \sum_\kappa |\varphi_\kappa(|r-\xi|)|^2\,d\xi \tag{61}$$

i.e. the square of the modulus of $\psi(r,t)$ determines, but by no means coincides with, the coordinate probability density $W(r,t)$.

The statistical theory represented by equations (53)-(59) coincides with the 'quantum mechanics with the non-negative QDF' (Kuryshkin, 1972c) in the case of the stationary, homogeneous and isotropic 'subquantum situation' (Kuryshkin, 1973). The general theoretical concepts of this theory are at present fairly well studied (Kuryshkin, 1971, 1972a, 1972c, 1973; Zaparovany, 1973; Zaparovany and coworkers, 1975).

The solution of actual problems within the framework of the mathematical formalism (53)-(59) yields quite satisfactory results (Kuryshkin, 1972c, 1973; Kuryshkin and Zaparovany, 1974; Zaparovany and coworkers, 1975). Thus equation (58), for example, results in the energy spectrum of a one-dimensional harmonic oscillator

$$E_n = \hbar\omega\left(n + \tfrac{1}{2}\right) + \varepsilon, \qquad n = 0, 1, \ldots \tag{62a}$$

where $\varepsilon \ge 0$ is the 'subquantum' energy, related to the 'subquantum' uncertainties of coordinate $\delta r$ and momentum $\delta p$. The energy $\varepsilon$ does not affect the level differences and is therefore not experimentally observable. The calculation of the average energy of an oscillator in a thermostat results in Planck's formula with the vacuum energy increased by $\varepsilon$.

A similar problem for an electron in a hydrogen-like atom, in the second-order approximation with respect to the coordinate 'subquantum' uncertainty $\delta r$, gives the energy spectrum

$$E_{nlm} = -\frac{Z^2 e^2}{2an^2} + T + \varepsilon_{nl} \tag{62b}$$

$n = 1, 2, \ldots$; $l = 0, 1, \ldots, n-1$; $m = -l, \ldots, 0, \ldots, l$. Here $T \ge 0$ is the 'subquantum' kinetic energy related to $\delta p$, and $\varepsilon_{nl} \ge 0$ is the 'subquantum' energy shift, connected with $\delta r$ and producing a splitting of the levels over $l$ that resembles the Lamb shift.
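Returning briefly to the oscillator spectrum (62a), the two statements made about it can be checked numerically. The sketch below is only an illustration under stated assumptions: the values of $\hbar\omega$, $\varepsilon$ and $kT$ are arbitrary and hypothetical, the canonical sum is truncated at `n_max` levels, and the function names are invented for this example. It compares the thermal average computed directly from the levels (62a) with Planck's formula in which the vacuum energy $\hbar\omega/2$ is increased by $\varepsilon$.

```python
import math


def level(n, hbar_omega, eps):
    """Oscillator level (62a): E_n = hbar*omega*(n + 1/2) + eps."""
    return hbar_omega * (n + 0.5) + eps


def thermal_average(hbar_omega, eps, kT, n_max=2000):
    """Canonical average energy from a truncated sum over the levels (62a)."""
    e0 = level(0, hbar_omega, eps)
    # Boltzmann weights relative to the ground level, to avoid tiny numbers.
    weights = [math.exp(-(level(n, hbar_omega, eps) - e0) / kT) for n in range(n_max)]
    z = sum(weights)
    return sum(level(n, hbar_omega, eps) * w for n, w in enumerate(weights)) / z


def planck_average(hbar_omega, eps, kT):
    """Planck's formula with the vacuum energy hbar*omega/2 increased by eps."""
    return 0.5 * hbar_omega + eps + hbar_omega / math.expm1(hbar_omega / kT)


HBAR_OMEGA = 1.0  # hypothetical energy unit
EPS = 0.3         # hypothetical 'subquantum' energy
KT = 0.7          # hypothetical temperature in the same units

SPACING = level(4, HBAR_OMEGA, EPS) - level(3, HBAR_OMEGA, EPS)
DIRECT = thermal_average(HBAR_OMEGA, EPS, KT)
PLANCK = planck_average(HBAR_OMEGA, EPS, KT)
print("level spacing:", SPACING)
print("direct average:", DIRECT, " shifted Planck formula:", PLANCK)
```

Because $\varepsilon$ shifts every level by the same amount, it cancels from the spacing and from the Boltzmann weights and reappears only as an additive constant in the average energy.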
The result (62b) agrees with the experimental data when

$$\delta r = 4.247 \times 10^{-12}\ \text{cm} \tag{63}$$

In contrast to generally accepted quantum mechanics, the energy levels (62) are not strictly determined. The dispersions of the levels, however, can be calculated once some actual functions $\varphi_\kappa$ are chosen.

11. FURTHER CONCRETIZATION OF THE PROBABILITY DENSITY OPERATOR

Even in the particular case of the theory analysed in the previous section, which permits the solution of some concrete problems, the operator $F$ is determined only up to a set of 'subquantum' functions $\varphi_\kappa$. It is quite clear that a further concretization of the operator $F$ is out of the question in the absence of some assumption concerning the physical nature of the 'subquantum situation'.

The arbitrariness in the choice of the 'subquantum' functions $\varphi_\kappa$, however, can be partially eliminated by comparing the results of the theory with experimental data. Thus, for instance, condition (63), obtained as a result of such a comparison, considerably reduces the arbitrariness in the choice of the 'subquantum situation'.

One more opportunity for reducing the arbitrariness in the choice of the functions can be pointed out. Experiment shows that the dispersion of energy levels is either zero or at least very small. One may demand, therefore, that the eigenfunctions of the operator $O(H)$ coincide with the eigenfunctions of the minimum uncertainty equation (31) when $A = H$. This is possible only with a commutation relation of the type (34), i.e.

$$[O(H), O(H^2)]_- = 0 \tag{64}$$

Obviously, with $H$ fixed, equality (64) limits the choice of the functions $\varphi_\kappa$.

Assuming that condition (64) is fulfilled, one can estimate (qualitatively at least) the dispersions of the energy levels (62). Calculations show:

$$\langle(\Delta E)^2\rangle_n = C + 2\varepsilon E_n \tag{65a}$$

$$\langle(\Delta E)^2\rangle_{nlm} = \Sigma + 4T\,(E_{nlm} - T) \tag{65b}$$

where the 'subquantum' quantities $C$, $\varepsilon$, $\Sigma$ and $T$ remain non-negative functionals of the set $\varphi_\kappa$, i.e. the additional concretizations (64) and (63) of the probability density operator are not sufficient for a single-valued calculation of the dispersions. However, equations (65) allow us to see the qualitative picture of how the dispersion changes as the energy level increases:

$$\langle(\Delta E)^2\rangle_0 = C + \varepsilon\omega\hbar, \qquad \langle(\Delta E)^2\rangle_n \to \infty \ \text{as}\ n \to \infty$$

$$\langle(\Delta E)^2\rangle_{100} = \Sigma - \frac{2TZ^2e^2}{a} \ge 0, \qquad \langle(\Delta E)^2\rangle_{nlm} \to \Sigma \ \text{as}\ n \to \infty$$

Thus, for both the oscillator and the electron in a hydrogen-like atom the minimum uncertainty is inherent in the ground-state (minimum energy) level. Hence the additional concretization of $F$ established by equality (64) leads to quite satisfactory qualitative results.

12. CONCLUSION

The investigations whose results are set forth in the present chapter make it possible to conclude as follows:

(1). There exists a multitude of theories satisfying the principal postulate of quantum mechanics and permitting a statistical interpretation on the basis of a coordinate-momentum probability density. These theories differ from each other in the concretization of the physical quantities as functions of phase space and time, of the states' space, of the probability density operator and of the dynamic function. Generally accepted quantum mechanics does not belong to their number.

(2). Any specific theory out of the multitude of theories under consideration is of a consistent probability character. The existence of the coordinate-momentum probability density $F(r,p,t)$ in such a theory and the functional relations $A(r,p,t)$ of the physical quantities $A$ with coordinates and momenta permit the calculation of all sorts of joint and conditional probabilities by the conventional methods of probability theory.

(3). Irrespective of the concretization, the theories in question bring into existence certain concepts ('subquantum' uncertainty, 'subquantum situation', etc.) which have no analogue either in classical mechanics or in generally accepted quantum mechanics.

(4). There exists a theoretical concretization which leads to quite satisfactory results. In one particular case of this concretization the statistical theory in question turns into the classical theory and partly (in the realm of physical quantities containing no products of like components of coordinate and momentum) into generally accepted quantum mechanics.

(5). Since it violates von Neumann's requirement for quantum operators, the theory in question is not subject to his theorem on the impossibility of 'hidden' parameters. Moreover, such a statistical theory requires the introduction of some new physical concepts to explain the physical nature of the 'subquantum situation'.

(6). In the statistical theories under consideration the concept of the uncertainty of physical quantities acquires a more general character than in accepted quantum mechanics. The relation (48b) between the coordinate and momentum uncertainties, which is nothing but Heisenberg's uncertainty principle reinforced by the 'subquantum' uncertainties, holds in the particular concretization of the theory.

(7). The use of the joint coordinate-momentum probability density in the theory under investigation is equivalent to the assumption that a physical system always possesses a quite definite coordinate and momentum. An uncertainty principle of the Heisenberg type, therefore, does not contradict the concept of the coordinate and the momentum existing simultaneously as physical realities.

(8). The existence of the uncertainty principle in the statistical theory under investigation, together with the fact that the coordinate and momentum can be considered as simultaneously existing physical realities, signifies that this theory does not pretend to be a complete description of physical reality.
In other words, the proposed statistical theory assumes the existence of a more profound, more deterministic theory, capable of explaining, among other things, the physical nature of the 'subquantum situation'.

ACKNOWLEDGEMENT

The author wishes to express his most sincere gratitude to Professor L. de Broglie, Professor Ya. P. Terletsky, Professor J. Lochak and the participants of the seminars at the Peoples' Friendship University (Moscow) and the Henri Poincaré Institute (Paris) for numerous and helpful discussions of the problem investigated in this paper.

REFERENCES

Andrade e Silva, J. L. and Lochak, G. (1969) Quanta, Grains et Champs, L'Univers de connaissances, Hachette, Paris.
Arinshtein, E. A. and Guitman, D. M. (1967) Izvest. Vusov U.S.S.R., Fiz., No. 5, 123.
Bess, L. (1973) Progr. Theoret. Phys., 49, 1889.
Blokhintsev, D. I. (1940) J. Phys., 2, 71.
Bohm, D. (1952) Phys. Rev., 85, 166.
Bohm, D. and Vigier, J. P. (1954) Phys. Rev., 96, 208.
Born, M. and Jordan, P. (1925) Z. Physik, 34, 858.
Bopp, F. (1956) Ann. Inst. Henri Poincaré, XV, 81.
Cohen, L. (1966a) J. Math. Phys., 7, 781.
Cohen, L. (1966b) The Philosophy of Science, 33, 317.
Cohen, L. (1970) J. Math. Phys., 11, 3296.
De-Broglie, L. (1927) J. Phys. Radium, 8, 225.
De-Broglie, L. (1953) La Physique Quantique Restera-t-elle Indéterministe?, Gauthier-
De-Broglie, L. (1956) Une Interprétation Causale et Non Linéaire de la Mécanique Ondulatoire: la Théorie de la Double Solution, Gauthier-Villars, Paris.
De-Broglie, L. (1964) Thermodynamique de la Particule Isolée, Gauthier-Villars, Paris.
De-Broglie, L. and Andrade e Silva, J. L. (1971) La Réinterprétation de la Mécanique Ondulatoire,
Dirac, P. A. M. (1958) The Principles of Quantum Mechanics, Oxford University Press, Oxford.
Einstein, A., Podolsky, B. and Rosen, N. (1935) Phys. Rev., 47, 777.
Einstein, A. (1948) Dialectica, II, 320.
Fenyes, I. (1952) Z. Physik, 132, 81.
Finkelstein, R. J., Fronsdal, C. and Kaus, P. (1956) Phys. Rev., 103, 1571.
Glasko, V. B., Lerust, F., Terletsky, Ya. P. and Shushurin, S. F. (1958) Zh. Eksperim. Teor. Fiz., U.S.S.R., 35, 452.
Gorshenkov, V. N. and Kognkov, V. L. (1973) Izvest. Vusov U.S.S.R., Fiz., No. 7, 140.
Gorshenkov, V. N., Denisova, N. A., Kognkov, V. L. and Ryasanova, L. Z. (1973) Teor. Mat. Fiz., U.S.S.R., 15, 288.
Groenewold, H. J. (1935) Physica, 12, 405.
Imre, K., Ozizmir, E., Rosenbaum, M. and Zweifel, P. F. (1967) J. Math. Phys., 8, 1097.
Kerner, E. H. and Sutsliffe, W. G. (1970) J. Math. Phys., 11, 391.
Klimantovich, Yu. L. and Silin, V. P. (1960) Uspe. Fiz. Nauk, U.S.S.R., 70, 247.
Kuryshkin, V. V. (1968) Sb. Nauchn. Robot Aspirantov, Peoples' Friendship University, Moscow, No. 1, 243.
Kuryshkin, V. V. (1969a) Izvest. Vusov U.S.S.R., Fiz., No. 4, 111.
Kuryshkin, V. V. (1969b) Sb. Nauchn. Robot Aspirantov, Peoples' Friendship University, Moscow, No. 6, 198.
Kuryshkin, V. V. (1971) Izvest. Vusov U.S.S.R., Fiz., No. 11, 103.
Kuryshkin, V. V. (1972a) Compt. Rend., 274, Série B, 1107.
Kuryshkin, V. V. (1972b) Ann. Inst. Henri Poincaré, XVII, 81.
Kuryshkin, V. V. (1972c) Compt. Rend., 274, Série B, 1163.
Kuryshkin, V. V. (1973) Int. J. Theoret. Phys., 7, 451.
Kuryshkin, V. V. (1974) Teor. Fiz., Peoples' Friendship University, Moscow, 78.
Kuryshkin, V. V. and Zaparovany, Yu. I. (1974) Compt. Rend., 279, Série B, 17.
Margenau, H. and Hill, R. N. (1961) Progr. Theoret. Phys., 26, 722.
Mehta, C. L. (1964) J. Math. Phys., 5, 677.
Moyal, J. E. (1949) Proc. Cambridge Phil. Soc., 45, 99.
Neumann, J. (1932) Mathematische Grundlagen der Quantenmechanik, Springer, Berlin.
Pena-Auerbach, L., Cetto, A. M. and Brody, T. A. (1972) Lettere alla Redazione, Nuovo Cimento, 5, 177.
Rivier, D. C. (1951) Phys. Rev., 83, 862.
Ruggery, G. J. (1971) Progr. Theoret. Phys., 46, 1703.
Rybakov, Yu. P. (1974) Foundations of Physics, 4, 149.
Schrödinger, E. (1935) Naturwissenschaften, 23, 807, 823, 844.
Schrödinger, E. (1955) Nuovo Cimento, 1, 5.
Shankara, T. S. (1967) Progr. Theoret. Phys., 37, 1335.
Shewell, J. R. (1959) Am. J. Phys., 27, 16.
Temple, G. (1935a) Nature, 135, 957.
Temple, G. (1935b) Nature, 136, 179.
Terletsky, Ya. P. (1937) Zh. Eksperim. Teor. Fiz., U.S.S.R., 7, 1290.
Terletsky, Ya. P. (1960) J. Phys. Radium, 21, 771.
Tolman, R. S. (1938) The Principles of Statistical Mechanics, Clarendon Press, New York.
Tyapkin, A. A. (1968) Development of Statistical Interpretation of Quantum Mechanics by Means of the Joint Coordinate-Momentum Representation, U.S.S.R., Dubna.
Weyl, H. (1950) The Theory of Groups and Quantum Mechanics, Clarendon Press, New York.
Wigner, E. P. (1932) Phys. Rev., 40, 749.
Yvon, J. (1948) Cahiers Phys., 33, 25.
Zaparovany, Yu. I. (1974) Izvest. Vusov U.S.S.R., Fiz., No. 6, 18.
Zaparovany, Yu. I. and Kuryshkin, V. V. (1975) The article is deposited in VINITI U.S.S.R., No. 2353-75, Dep.
Zaparovany, Yu. I., Kuryshkin, V. V. and Lyabis, I. A. (1975) Sovremen. Zadachi v tochnikh naukakh, Peoples' Friendship University, Moscow, No. 1, 89 and 94.
Zlatev, I. S. (1974) Compt. Rend., 27, 311.

PART Measurement Theory

The Problem of Measurement in Quantum Mechanics

Ludovico Lanz
Istituto di Fisica dell'Università, Milan, Italy

1. INTRODUCTION

Quantum mechanics is itself a statistical theory of measurements. It works very well for some measurements, for example wonderfully well in atomic spectroscopy. It may appear surprising that a theory of measurement should be basic to quantum mechanics, since quantum mechanics is explained in most textbooks, and is applied and further developed, without particular reference to such a theory of measurement. The peculiarity of quantum mechanics is that one first has measurements and only subsequently must worry about what has been measured. There are two very different attitudes towards quantum mechanics, namely:

(1).
Quantum mechanics is the fundamental theory of physics; any physical theory is essentially a theory of measurements.

(2). Quantum mechanics is the fundamental theory of microsystems; therefore the theory of microsystems is essentially a theory of measurements on microsystems by macrosystems. A primary objective of physics is therefore to describe the nature of macrosystems. To reach such an objective the physics of microsystems is an essential ingredient.

The first attitude is the point of view one learns in textbooks of quantum mechanics, e.g. Dirac's fundamental book. There is an interpretation of physics, first expressed by J. von Neumann (1955) and favoured by Wigner (1971), in which the observer has a fundamental role: physics describes the observations of the observer, and his impressions are the basic entities. Since observations are made by measuring apparatuses, the following consistency problem arises. Any observable of a system must be equivalent to another observable of a second suitable system interacting with the first one, if the latter is to be interpreted as a measuring apparatus. Von Neumann gives a schematic solution of this problem of measurement. However, the impressions of an observer are real but absolutely private entities. An observer cannot point out the impressions he has received from an observation; such impressions are therefore outside the realm of any science. On the contrary, one needs objects as the basic entities of physics.

One can identify objective aspects in quantum mechanics: there are systems and sets of measurements on them which are dispersionless, i.e. the outcome of these measurements is certain. Corresponding to each measurement of this kind one can then attribute an objective property to the system. The set of such properties is the 'state' of a single system. Quantum mechanics has the following general feature.
At any time a statistical collection of systems can be decomposed into subcollections such that to each system in each subcollection an objective 'state' can be attributed. One can identify in such properties the basic entities which are measured. Consistently with this point of view, Jauch (1968, 1969) and Piron (1964) were able to obtain quantum mechanics as a consequence of simple axioms about yes-no experiments, properties and states.

The 'state' of a macroscopic system should embody all the typical macroscopic properties known; in the case of a measuring apparatus, for example, a certain position of the pointer is a macroscopic property. Let $A$ be an apparatus with a pointer. The pointer moves from $\Lambda_0$ to $\Lambda_1$ when $A$ interacts with a system which has a property $p$; the pointer does not move when the system has the property $p^*$ ($p^*$: not to have the property $p$). Let $h_A$ ($h_S$) be the Hilbert space of the apparatus (of the system); $h_p \subset h_S$ the subspace of $h_S$ associated with the property $p$; $h_\Lambda \subset h_A$ the subspace associated with the position $\Lambda$ of the pointer; $H$ the Hamiltonian of the joint system; and $t$ the duration of the measurement. One must have

$$e^{-iHt}\,P^A_{\varphi_{\Lambda_0}} \otimes P^S_{\psi}\,e^{iHt} = P^A_{\varphi_{\Lambda_1}} \otimes P^S_{\psi}, \qquad e^{-iHt}\,P^A_{\varphi_{\Lambda_0}} \otimes P^S_{\bar\psi}\,e^{iHt} = P^A_{\varphi'_{\Lambda_0}} \otimes P^S_{\bar\psi} \tag{1}$$

for all $\psi \in h_p$, $\bar\psi \in h_S \ominus h_p$, $P_\psi$ being the projection on $\psi$; $\varphi_\Lambda, \varphi'_\Lambda \in h_\Lambda$.

Consider now a case in which the system $S$ has neither property $p$ nor $p^*$, but has another property $p'$ such that $h_{p'}$ is contained neither in $h_p$ nor in $h_S \ominus h_p$. Let $\eta \in h_{p'}$ be the state of $S$:

$$\eta = c_1\psi + c_2\bar\psi; \qquad \psi \in h_p, \quad \bar\psi \in h_S \ominus h_p; \qquad \|\psi\| = \|\bar\psi\| = 1$$

One has by (1)

$$e^{-iHt}\,P^A_{\varphi_{\Lambda_0}} \otimes P^S_{\eta}\,e^{iHt} = |c_1|^2\,P^A_{\varphi_{\Lambda_1}} \otimes P^S_{\psi} + |c_2|^2\,P^A_{\varphi'_{\Lambda_0}} \otimes P^S_{\bar\psi} + \mathrm{Re}\,c_1 c_2^*(\ldots) \tag{2}$$

This formula contains the whole problem of measurement in the first-mentioned attitude; if the last term were zero, no problem would arise.
The collection of systems $A+S$ could then be separated into two subcollections: the first containing a fraction $|c_1|^2$ of systems $A+S$, each of which has part $A$ with the pointer at $\Lambda_1$; the second containing a fraction $|c_2|^2$ of systems $A+S$ with part $A$ having the pointer at $\Lambda_0$. This is in complete agreement with the physical meaning which quantum mechanics gives to the coefficients $c_1$, $c_2$ in $\eta = c_1\psi + c_2\bar\psi$. However, the third term in equation (2) is non-zero if $c_1 \neq 0$ and $c_2 \neq 0$. It is the infamous interference term. The mathematical reason for its presence is the following: $P^A_{\varphi_{\Lambda_0}} \otimes P^S_{\eta}$ is a pure state, i.e. an extreme element of the convex set of states. If time evolution is represented by linear, invertible mappings on the set of states, an extreme element is mapped into an extreme element; if the interference term were zero, the right-hand side of equation (2) would be a mixture of two states and not an extreme state, which is impossible.

Due to the interference term, the pointer of $A$ no longer has a position after the interaction. The notion of objectivity in quantum mechanics is too restrictive to give an account of the objective properties of part $A$ of the system $A+S$ if $A$ and $S$ interact. More generally, if one considers a composite system $S_1 + S_2$, then due to the interaction there are no properties of $S_1 + S_2$ that can be described as: $S_1$ has a property $p_1$ and $S_2$ has a property $p_2$. As long as $S_1$ and $S_2$ are microsystems this can look strange but is not a basic difficulty: it is essentially the Einstein-Rosen-Podolsky paradox. If one of the components is macroscopic one meets a big difficulty, as has been particularly stressed by d'Espagnat (1971b). The difficulty lies either in the objectivity criterion or in the time evolution law. It is difficult to see how to generalize the objectivity criterion; time evolution is a consequence of symmetry under time translations of an isolated system.
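The role of the interference term can be made concrete with the smallest possible model. The sketch below is an illustration only, under stated assumptions: the pointer and the system are each reduced to two basis states, the interaction is replaced by a hypothetical permutation matrix `U` that realizes the two transitions in (1), and the coefficients `C1`, `C2` are arbitrary. It shows that the evolved state keeps purity $\mathrm{Tr}\,\rho^2 = 1$, whereas the mixture made of the first two terms of (2) has purity $|c_1|^4 + |c_2|^4 < 1$; the difference between the two density matrices is exactly the off-diagonal term proportional to $c_1 c_2^*$.

```python
def outer(v):
    """Density matrix |v><v| of a state vector v."""
    return [[a * b.conjugate() for b in v] for a in v]


def apply(u, v):
    """Matrix-vector product u v."""
    return [sum(u[i][j] * v[j] for j in range(len(v))) for i in range(len(u))]


def purity(rho):
    """Tr(rho^2); equals 1 only for a pure (extreme) state."""
    n = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(n) for j in range(n)).real


# Basis index = 2 * pointer + system, with pointer 0 = Lambda_0, 1 = Lambda_1
# and system 0 = psi (property p), 1 = psi-bar (property p*).
# Hypothetical unitary: |Lambda_0, psi> -> |Lambda_1, psi>, |Lambda_0, psi-bar> unchanged.
U = [
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]

C1 = complex(0.6, 0.0)  # hypothetical coefficients with |C1|^2 + |C2|^2 = 1
C2 = complex(0.0, 0.8)

INITIAL = [C1, C2, 0j, 0j]  # pointer at Lambda_0, system in eta = C1 psi + C2 psi-bar
FINAL = apply(U, INITIAL)
RHO_FINAL = outer(FINAL)

POINTER1_PSI = outer([0j, 0j, 1 + 0j, 0j])     # pointer at Lambda_1, system in psi
POINTER0_PSIBAR = outer([0j, 1 + 0j, 0j, 0j])  # pointer at Lambda_0, system in psi-bar
MIXTURE = [
    [abs(C1) ** 2 * POINTER1_PSI[i][j] + abs(C2) ** 2 * POINTER0_PSIBAR[i][j]
     for j in range(4)]
    for i in range(4)
]
INTERFERENCE = [[RHO_FINAL[i][j] - MIXTURE[i][j] for j in range(4)] for i in range(4)]

print("purity of evolved state:", purity(RHO_FINAL))
print("purity of mixture:", purity(MIXTURE))
print("interference element:", INTERFERENCE[2][1])
```

Any unitary that reproduces the transitions in (1) would serve equally well here; the permutation matrix is chosen only because it is the simplest one.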
Since the interference term depends critically on small external perturbations, it may be that it is not really meaningful under physically realizable conditions of isolation, as remarked by Zeh (1971). If one considers a system as a subsystem of a larger one, there is no strict condition on its time evolution. 'Non-Hamiltonian' mappings, which are widely used in the theory of open systems, are admissible. Such mappings are not invertible, and extreme states can be mapped into mixtures. The idea that any observed system should be considered as an open system has led Everett (1957) to claim that no system can be isolated from the rest of the universe, in which the observer must also be included. However, there are many examples of successful phenomenological theories for isolated macrosystems. The fact that elements enter the quantum-mechanical description of macrosystems which are highly unstable, or foreign to the system, such as Everett's wave function of the whole universe, is an indication that quantum mechanics does not work very well for macrosystems. Therefore, in my opinion, due to the difficulty in the problem of measurement, attitude (1) should be dismissed.

The second-mentioned attitude, at least at a linguistic level, was that held by Bohr. He puts the objective character of apparatuses in the foreground and, perhaps misleadingly, describes them as 'classical systems'. This was often interpreted in the sense that there are systems, to be used as measuring apparatuses, for which pre-quantum physics is the right theory; since, on the contrary, quantum effects are well known in macrophysics, the concept of 'classical systems' seems fictitious. Ludwig (1972) has recently formulated a new general theory of macrosystems in which their objective character is built in. He has also developed a theory for a composite system (macrosystem + microsystem) (Ludwig, 1966b).
The problem of measurement is formally solved: it is shown that registering the state of a macroscopic apparatus after its interaction with a microsystem is equivalent to the measurement of an observable of the microsystem. In principle such an observable can be calculated in terms of mathematical elements referring to the description of the macroscopic apparatus. Not only can the physical content of a statistical operator for a microsystem be read off from the measuring apparatus $A$, but the initial statistical operator can also be read off from the macro-source $S$ of the microsystem. In conclusion, quantum mechanics for a microsystem $a$ produced by the source $S$ and measured by the apparatus $A$ describes to a certain extent the interaction between $S$ and $A$, the microsystem being the vehicle of such an interaction; in principle the macroscopic description of the system $S + A$ can completely replace the quantum mechanics of $a$. A final primacy of the objectivistic, or classical, way of describing the world is established, consistently with the quantum mechanics of microsystems. Such a result does not mean that one has gained a classical insight into microsystems, but rather that one has a classical 'outsight' from the microsystems; no classical hidden variables have been found, but a classical anchorage for the microsystems has been achieved. It may be that the search for such an anchorage, which is a basic goal of a sane philosophy, was one of the motivations for looking for hidden variables.

An important consequence of this point of view is that one has no need for objectivity criteria inside the quantum mechanics of a microsystem. Properties and states of single microsystems are interesting but not basic features of the formalism. Peculiarities concerning properties of composite microsystems, as in the ERP paradox, are no longer a difficulty.
Quantum mechanics should not be based on properties and states of microsystems, but should be a theory for a certain class of experiments which, by means of an interaction between a source $S$ and a measuring apparatus $A$, give evidence for a microsystem; this is just the starting point of Ludwig's axiomatic approach to quantum mechanics (Ludwig, 1970).

At this point most physicists would be disappointed, since this theory apparently gives a secondary role to quantum mechanics. However, let me stress that the theory of macrosystems proposed by Ludwig is nothing more than a formalization of statistics and 'state' space. Every type of time evolution, deterministic or not, Markoffian or not, can be placed into it. Obviously all known examples of theories for macrosystems fit into this scheme, which, however, does not help one to find such theories. The sole general and conclusive way to build a theory of a macrosystem is to rely on its atomistic structure and describe it by the mechanics of its microcomponents. So indeed microsystems entered physics first as a hypothesis, then as objects that could be emitted and revealed by macrosystems. The basic role of quantum mechanics is to provide the concept of a particle and an insight into the interactions between particles. Particles and their interactions are then the starting point for the atomistic theory of macrosystems, which should finally fill up the formal scheme proposed by Ludwig. The actual theory of atomic structure is quantum statistical mechanics, in brief N-body quantum theory.

We can therefore state the problem of measurement in the following form. N-body quantum theory must yield the concrete input for Ludwig's theory of macrosystems. The quantum theory of an N-body system interacting with a microsystem must yield the concrete input for the new theory of a macrosystem interacting with a microsystem. The important point is the following: we need not rely on an objectivity criterion in quantum mechanics and pretend that macroscopicity is a 'property' (in the technical sense) of the N-body structure, which leads to the difficulty with the interference term in equation (2). One must simply extract from the N-body theory, in a sufficiently general and precise way, what is relevant as an input for the new theory of macrosystems and throw away the rest as physically irrelevant. This has not yet been done in a satisfactory way. In any case, the problem of measurement in quantum mechanics is neither a philosophical quarrel about the interpretation of the world nor a basic difficulty of quantum mechanics as in equation (2), but a technical problem in N-body theory.

This important conclusion was reached long before the recent theory of macrosystems by Ludwig (1953) and by Daneri, Loinger and Prosperi (1962), who identified the technically relevant point in the ergodic behaviour of many-particle systems. Since these early attempts, other main approaches and ideas in quantum statistics, such as the master equation theory (Lanz and coworkers, 1971), the independent subdynamics theory (George and coworkers, 1972) and the C* algebra formalism (Hepp, 1972), have been confronted with the problem of measurement; all these attempts suffer from the lack of a clear and general mathematical characterization of macroscopicity. A general and very readable survey of the way from microphysics to macrophysics is given by Caldirola (1974).

The somewhat utilitarian exploitation of N-body theory by which the problem of measurement should be solved is justified if one takes into account that N-body theory is a formal extrapolation of the quantum mechanics of microsystems to systems with an extremely large number of particles, in which the quite unobservable correlations between all the particles are described.
Let us suppose we have on the one side the quantum mechanics of microsystems and on the other Ludwig's scheme for macrosystems, filled with the aid of N-body quantum theory; one can then hope that a new unified theory can be revealed, ranging from microsystems to macrosystems, which could perhaps also cover intermediate systems such as macromolecules (Ludwig, 1972a,b).

I shall discuss the problem of measurement from the second attitude mentioned above. In Section 2 quantum mechanics as a theory of measurements on microsystems will be discussed, with particular reference to the ERP paradox; in Section 3 a sketchy, but I hope not too distorted, account of Ludwig's theory of macrosystems will be given. Finally, in Section 4, the problem of measurement within N-body theory will be stated in a precise way.

2. QUANTUM MECHANICS AS A THEORY OF MEASUREMENT ON MICROSYSTEMS

There are interactions between two macrosystems which can be explained in terms of a microsystem which the first macrosystem emits and by which the second one is affected. We shall call the first macrosystem the source or preparation part and the second the affected part. The typical feature of such an interaction is that it causes a perturbation which spreads out from one or more pointlike regions inside the affected part. In a space-time description of the affected part these perturbations involve space-time points inside one or more cones with axes parallel to the future time axis. The occurrence of such perturbations is very stochastic, i.e. repetition of the same experiment under the same conditions of the preparation and affected parts yields a different pattern of the aforementioned pointlike regions. The single experiments are not reproducible and it would be meaningless to formulate a theory for them.
If a single experiment is repeated n times, the frequencies of occurrence of a certain type of perturbation can be measured; if n is large enough such frequencies are reproducible and it is worthwhile formulating a theory for them. In fact such frequencies have, in a certain sense, a universal character: they are completely independent of very many features of the preparation and the affected part. Such a situation is an obvious consequence of the atomistic structure of matter: microsystems affect only some atoms of the affected part, by interactions that have a universal character.

We shall mean by an experiment on a microsystem a statistical collection of a large number of single experiments, in each of which a single microsystem is emitted and revealed. It consists in principle in the repetition of the following steps:

(a) production of the microsystem by a source which has been prepared by taking into account a certain set $\alpha$ of macroscopic prescriptions, referred to a frame R, the same procedure being followed in the n repetitions;

(b) observation of whether a measuring apparatus, which has been prepared taking a certain set $\beta$ of macroscopic prescriptions into account, is affected or not in a certain prescribed way $\gamma$ ($\beta$ and $\gamma$ are also referred to R and are fixed in the n repetitions);

(c) counting how many times $n_+$ the apparatus is affected.

The frequency $n_+/n$ is the quantitative result of the experiment. Let us call $(R, \alpha)$ a preparation procedure and $(\beta, \gamma)$ a measuring procedure.

Ludwig has obtained the axiomatic structure of quantum mechanics, in a generalized form, starting from suitable axioms about a physical input consisting of preparation procedures, measuring procedures and frequencies (Ludwig, 1970). Let me summarize the result in the particular case of no superselection rules. The set M of experiments which give evidence for a microsystem is described by means of a Hilbert space $h_M$ with the following interpretation.
Each statistical operator W on $h_M$ represents a class of equivalent preparation procedures, where two preparation procedures are equivalent if they yield the same frequencies for any measuring procedure. Each operator F on $h_M$ such that $0 \le F \le I$ represents a class of equivalent measuring procedures, where two measuring procedures are equivalent if they yield the same frequency for any preparation procedure.

The operators F are called 'effects' by Ludwig; in the conventional axiomatics of quantum mechanics measuring procedures are associated only with projections on $h_M$. The frequency of the effect F, when the preparation W is made, is given by $\mathrm{Tr}(FW)$. The symmetry of the theory under time translations implies that a semigroup $\mathcal{V}(t)$, $t > 0$, exists of mappings of the set $L_M$ of effects into $L_M$, such that $\mathcal{V}(t)I = I$, I being the identity operator. $\mathcal{V}(t)$ is contractive on $B(h_M)$, the Banach space of bounded linear operators on $h_M$ (Comi and coworkers, 1975). In the standard axiomatics of quantum mechanics one requires that $\mathcal{V}(t)$ can be extended to a group; then projection operators are mapped into projection operators and $\mathcal{V}(t)$ must have the following structure:

$$\mathcal{V}(t)F = V(t)\,F\,V^{+}(t),$$

$V(t)$ being a unitary group on $h_M$. Let $iH$ be the generator of $V(t)$; then H is the hamiltonian of the system. Let $\mathcal{V}'(t)$ be the adjoint operator of $\mathcal{V}(t)$; $\mathcal{V}'(t)$ maps the set $K(h_M)$ of all statistical operators on $h_M$ into itself. $\mathcal{V}'(t)$ is a contractive operator on $\tau C(h_M)$, where $\tau C(h_M)$ is the Banach space of 'trace class' operators on $h_M$. Due to the definition of $\mathcal{V}'(t)$ one has:

$$\mathrm{Tr}\big((\mathcal{V}(t)F)\,W\big) = \mathrm{Tr}\big(F\,\mathcal{V}'(t)W\big).$$

Let us consider a preparation procedure $W = (\alpha, R)$ and the transformed preparation $\mathcal{V}'(t)W$; $\mathcal{V}'(t)W$ is the preparation procedure W shifted back in time by a time interval of length t. The relation between W and $\mathcal{V}'(t)W$ can be described as follows: $\mathcal{V}'(t)W$ consists of the preparation W and of waiting a time t after W is finished.
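As a numerical illustration of the rules just stated, the following minimal sketch builds a statistical operator W and an effect F on a three-dimensional Hilbert space, evaluates the frequency Tr(FW), and checks the duality between evolving the effect and evolving the preparation. It covers only the special case of a unitary group V(t) = exp(iHt); the dimension, the randomly generated operators and all function names are assumptions made for the example and are not part of Ludwig's formalism.

```python
import numpy as np

def random_statistical_operator(dim, rng):
    """Return a positive, trace-one matrix (a statistical operator W)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    w = a @ a.conj().T                      # positive by construction
    return w / np.trace(w).real

def random_effect(dim, rng):
    """Return a Hermitian matrix F with 0 <= F <= I (an effect)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    f = a @ a.conj().T
    return f / np.linalg.eigvalsh(f).max()  # largest eigenvalue becomes 1

def unitary_group(hamiltonian, t):
    """V(t) = exp(iHt), from the eigen-decomposition of the Hermitian H."""
    energies, vecs = np.linalg.eigh(hamiltonian)
    return vecs @ np.diag(np.exp(1j * energies * t)) @ vecs.conj().T

def evolve_effect(f, hamiltonian, t):
    """Action on effects: F -> V(t) F V(t)^+."""
    v = unitary_group(hamiltonian, t)
    return v @ f @ v.conj().T

def evolve_preparation(w, hamiltonian, t):
    """Adjoint action on statistical operators: W -> V(t)^+ W V(t)."""
    v = unitary_group(hamiltonian, t)
    return v.conj().T @ w @ v

def frequency(f, w):
    """Frequency of the effect f for the preparation w: Tr(f w)."""
    return np.trace(f @ w).real

rng = np.random.default_rng(0)
dim = 3                                     # assumed toy dimension
h = rng.normal(size=(dim, dim))
h = (h + h.T) / 2                           # a Hermitian 'hamiltonian'
W = random_statistical_operator(dim, rng)
F = random_effect(dim, rng)
t = 0.7

lhs = frequency(evolve_effect(F, h, t), W)       # Tr((V(t)F) W)
rhs = frequency(F, evolve_preparation(W, h, t))  # Tr(F V'(t)W)
print(lhs, rhs)
```

The two printed numbers agree up to rounding, and any frequency computed this way lies between 0 and 1 because F lies between 0 and I while W is positive with unit trace.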
In the time interval [0, t] only the microsystem evolves; therefore $\mathcal{V}'(t)$ can be considered as the time evolution operator of the microsystem.

Space-time symmetry has further consequences: in the non-relativistic case (for simplicity we consider only this case) one has on $h_M$ a unitary, projective representation of the group G of space translations, accelerations and rotations. Let us assume that such a representation is irreducible and that the theory describes at least the effects linked to the simplest perturbations of the affected parts; such simplest perturbations are those spreading out from one point-like region. The representations of G which satisfy these requirements are characterized by two indices, a half-integer number s and a positive number m, to be interpreted as the spin and mass of an elementary particle. By suitable choices of the parameters s and m certain sets M of experiments can be described: the corresponding microsystems are the simplest ones; they are 'one particle' systems.

Let us consider two experiments concerning two particles I and II and let us build a correlated experiment, in which the two preparation procedures and the two measuring procedures are performed together. Such correlated experiments can obviously be described by preparations and by effects

$$W^{I}\otimes W^{II} \in K^{I}\otimes K^{II} \subset \tau C(h^{I})\otimes \tau C(h^{II}),$$
$$F^{I}\otimes F^{II} \in L^{I}\otimes L^{II} \subset B(h^{I})\otimes B(h^{II}).$$

Since $\tau C(h^{I})\otimes\tau C(h^{II}) = \tau C(h^{I}\otimes h^{II})$ and $B(h^{I})\otimes B(h^{II}) = B(h^{I}\otimes h^{II})$, $W^{I}\otimes W^{II}$ and $F^{I}\otimes F^{II}$ are very particular statistical operators and effects on the Hilbert space $h^{I}\otimes h^{II}$. The axiomatic structure of quantum mechanics leads us to assume the existence of a microsystem to be associated with the Hilbert space $h^{I}\otimes h^{II}$, i.e. each $W \in K(h^{I}\otimes h^{II})$ and each $F \in L(h^{I}\otimes h^{II})$ should in principle correspond to a preparation and to a measuring procedure.
On $h^{I}\otimes h^{II}$ one can place a unitary representation of $G\times G$, and at least effects linked with two-point perturbations of the affected part can be described. Then symmetry allows the following structure for $V(t)$:

$$V(t) = e^{iHt},\qquad H = H^{I}\otimes I^{II} + I^{I}\otimes H^{II} + H_{\mathrm{int}},$$

$H_{\mathrm{int}}$ describing the interaction between the two particles. Microsystems of this type, with a statistical operator such that the interaction plays a role so that $H_{\mathrm{int}}$ can be tested, are prepared in all scattering experiments and are emitted from macrosystems in many spontaneous decay processes. In conclusion, if I and II are particles, a microsystem (I, II) also exists with Hilbert space $h^{I}\otimes h^{II}$, and one can explain experiments about (I, II) by a suitable choice of $H_{\mathrm{int}}$, at least in the non-relativistic case.

Let us investigate the structure of microsystem (I, II) and its relation with particles I and II. Considering effects of the form $F^{I}\otimes I^{II}$ and $I^{I}\otimes F^{II}$, one has

$$\mathrm{Tr}(F^{I}\otimes I^{II}\,W) = \mathrm{Tr}_{h^{I}}(F^{I}W^{I}),\qquad \mathrm{Tr}(I^{I}\otimes F^{II}\,W) = \mathrm{Tr}_{h^{II}}(F^{II}W^{II}),$$

where

$$W^{I} = \mathrm{Tr}_{h^{II}}(W)\in K^{I}(h^{I})\quad\text{and}\quad W^{II} = \mathrm{Tr}_{h^{I}}(W)\in K^{II}(h^{II}).$$

For physically meaningful $H_{\mathrm{int}}$ one has that for t large enough, $t > \bar t$, $\mathcal{V}'(t)W$ can be replaced with $\mathcal{V}^{\prime I}(t)\otimes\mathcal{V}^{\prime II}(t)\,\mathcal{S}W$, $\mathcal{S}$ being a 'collision' mapping of K onto K. Then for $t > \bar t$ one has

$$\mathrm{Tr}_{h^{I}\otimes h^{II}}\big(F^{I}\otimes I^{II}\,\mathcal{V}'(t)W\big) = \mathrm{Tr}_{h^{I}}\big(F^{I}\,\mathcal{V}^{\prime I}(t)W^{I}\big),\qquad W^{I} = \mathrm{Tr}_{h^{II}}(\mathcal{S}W),$$

and similarly with I and II interchanged. The interpretation of these results is straightforward: we have two particles I and II which, for $t > \bar t$, no longer interact and are described by the initial statistical operators $W^{I}$ and $W^{II}$. We can look for the joint occurrence of the effects $F^{I}\otimes I^{II}$ and $I^{I}\otimes F^{II}$, to be represented by $F^{I}\otimes F^{II}$, which has a frequency $\mathrm{Tr}(F^{I}\otimes F^{II}\,W)$. In conclusion: the microsystem (I, II) is a pair of particles I, II and a correlation law for joint measurements. The problem of the description of a microsystem (I, II) was raised by the well-known EPR paradox (Einstein and coworkers, 1935).
To make the discussion more specific, let us consider the following example given by Bohm (1951): two spin-$\tfrac12$ particles, emitted in a singlet state by a suitable source, move in opposite directions, and the components of the spins $S_1$ and $S_2$ in two directions $n_1$, $n_2$ are measured, e.g. by means of two Stern-Gerlach magnets followed by two detectors. If $u^{n}_{\pm}$ are the normalized eigenvectors of $S_n$ corresponding to the eigenvalues $\pm\tfrac12$, one has, neglecting for simplicity the spatial coordinates:

$$W = P_\psi,\qquad \psi = \frac{1}{\sqrt2}\big(u^{n}_{+}\otimes u^{n}_{-} - u^{n}_{-}\otimes u^{n}_{+}\big)\in \mathbb{C}^2\otimes\mathbb{C}^2,\qquad W^{I} = \tfrac12 I^{I},\quad W^{II} = \tfrac12 I^{II}.$$

Let $P_n$ be the eigenprojection of $S_n$ corresponding to the eigenvalue $\tfrac12$; for the effects $P^{I}_{n}\otimes I^{II}$, $I^{I}\otimes P^{II}_{n'}$, $P^{I}_{n}\otimes P^{II}_{n'}$ one has the probabilities

$$\mathrm{Tr}(P^{I}_{n}\otimes I^{II}\,P_\psi) = \mathrm{Tr}(I^{I}\otimes P^{II}_{n}\,P_\psi) = \mathrm{Tr}(P^{I}_{n}\otimes P^{II}_{-n}\,P_\psi) = \tfrac12,$$
$$\mathrm{Tr}(P^{I}_{n}\otimes P^{II}_{n}\,P_\psi) = 0,\qquad \mathrm{Tr}(P^{I}_{z}\otimes P^{II}_{x(y)}\,P_\psi) = \tfrac14.$$

The microsystem consists of two particles with isotropic spin statistics; it has the following property: total spin = 0; no spin property can be attributed to particles I and II. The simple correlation law can be anticipated from rotational symmetry.

Take a collection of microsystems (I, II) with preparation $P_\psi$, and put an apparatus for measuring $S^{I}_{n}$ in the path of one of the two particles. Theoretically the frequency of $S^{I}_{n} = \tfrac12\ (-\tfrac12)$ is $\tfrac12$. Then put an apparatus for measuring $S^{II}_{n'}$ in the path of the second particle and again measure the frequency of $S^{I}_{n} = \tfrac12\ (-\tfrac12)$. The two sets of apparatus can coexist since

$$[P^{I}_{n}\otimes I^{II},\ I^{I}\otimes P^{II}_{n'}] = 0.$$

We have the same effect as before,

$$P^{I}_{n}\otimes I^{II} = P^{I}_{n}\otimes P^{II}_{n'} + P^{I}_{n}\otimes P^{II}_{-n'},$$

and the same result as before. Apparatus II does not influence in any way the physics of particle I. One has the identity

$$\mathrm{Tr}(P^{I}_{n}\otimes I^{II}\,P_\psi) = \mathrm{Tr}(P^{I}_{n}\otimes I^{II}\,W_{m,n'}),$$

with

$$W_{m,n'} = \tfrac12 W_{n'} + \tfrac12 W_{-n'},\qquad W_{n'} = 2\,(I^{I}\otimes P^{II}_{n'})\,P_\psi\,(I^{I}\otimes P^{II}_{n'}),\qquad \forall n'.$$

We see that as far as particle I is concerned, we can describe the collection as built from two equal subcollections of microsystems, with the property $S^{II}_{n'} = \tfrac12$ for the first one and the property $S^{II}_{-n'} = \tfrac12$ for the second one; this holds for any direction n', obviously without any consequence for particle I. Let us use measurement II to build a new statistical collection: we select those microsystems for which apparatus II has yielded the result $S^{II}_{n'} = \tfrac12$ $(-\tfrac12)$ and look for the effects produced by the microsystems of this collection. The statistical operator for this new collection is $W_{n'}$ $(W_{-n'})$ and a single microsystem of it has the property $S^{II}_{n'} = \tfrac12$ $(S^{II}_{-n'} = \tfrac12)$. If $S^{I}_{n}$ is measured, the probability of the result $S^{I}_{n} = \tfrac12$ depends on $n\cdot n'$; e.g. in the case $n = z$, $n' = x\ (y)$ it is $\tfrac12$, in agreement with the probability 1/4 of $S^{I}_{z} = \tfrac12$, $S^{II}_{x(y)} = \tfrac12$ for the initial collection. The transition $W \to W_{n'}$ is not a consequence of the interaction of particle II with apparatus II, but is a consequence of the repreparation procedure in which measurement II is used. We stress that such a repreparation cannot coexist with another apparatus that measures $S^{II}_{n''}$, $n''\ne n'$; therefore the two decompositions of W according to the measurements of $S^{II}_{n'}$, $S^{II}_{n''}$ cannot be made together.

There is nothing peculiar or paradoxical in this description. However, it is possible that one would like to consider the microsystem (I, II) as a system of two correlated particles I and II, in which case difficulties would arise. The correlation should be a correlation between properties of the two particles. Unfortunately, due to the interaction, $W^{I}$, $W^{II}$ are not pure states even if W is a pure state; this means that if a property can be attributed to a microsystem (I, II), such a property is not expressible as a property of particles I and II. Thus one has nothing to correlate and no vector state can be associated with the components I and II of microsystem (I, II); d'Espagnat describes this peculiarity as 'non-separability' of (I, II) into I and II (d'Espagnat, 1971a).
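The singlet example can be checked numerically. The sketch below is only an illustration under stated assumptions: spins are represented by Pauli matrices with S = σ/2, particle I is the first tensor factor, and every function name is hypothetical. It computes joint frequencies for the singlet preparation, the operator of the collection selected by the result +1/2 at apparatus II, and the frequency seen on particle I in that selected collection.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)

def spin_up_projection(n):
    """Eigenprojection P_n of S_n = (1/2) sigma.n for the eigenvalue +1/2."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    sigma_n = n[0] * SIGMA_X + n[1] * SIGMA_Y + n[2] * SIGMA_Z
    return (I2 + sigma_n) / 2

# Singlet vector psi = (u+ (x) u- - u- (x) u+)/sqrt(2) and preparation W = P_psi.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
psi = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
P_psi = np.outer(psi, psi.conj())

def joint_probability(n1, n2):
    """Tr(P_n1 (x) P_n2 P_psi): both apparatuses give +1/2."""
    effect = np.kron(spin_up_projection(n1), spin_up_projection(n2))
    return np.trace(effect @ P_psi).real

def probability_I(n1, w):
    """Tr(P_n1 (x) I w): frequency of +1/2 on particle I for the collection w."""
    return np.trace(np.kron(spin_up_projection(n1), I2) @ w).real

def reprepared_operator(n2):
    """2 (I (x) P_n2) P_psi (I (x) P_n2): collection selected by +1/2 at II."""
    e = np.kron(I2, spin_up_projection(n2))
    return 2 * e @ P_psi @ e

x, y, z = (1, 0, 0), (0, 1, 0), (0, 0, 1)
minus_x, minus_z = (-1, 0, 0), (0, 0, -1)

print(joint_probability(z, z), joint_probability(z, minus_z), joint_probability(z, x))
W_x = reprepared_operator(x)
mixture = 0.5 * W_x + 0.5 * reprepared_operator(minus_x)
print(probability_I(z, P_psi), probability_I(z, mixture), probability_I(z, W_x))
```

For n = z and n' = x the conditional frequency in the selected collection is the joint frequency 1/4 divided by the selection frequency 1/2, which is why the last printed value equals 1/2 while particle I alone shows no change.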
In my opinion this indicates that one should not ascribe a basic role to the concepts of property and of state for a single microsystem, as I have already stressed in Section 1, where attitudes (1) and (2) were discussed. If attitude (1) is chosen one can assume that not the whole set $K(h^{I}\otimes h^{II})$ is physically meaningful, but only statistical operators with the following structure:

$$W = \sum_j \lambda_j\, P^{I}_j\otimes P^{II}_j,$$

where $P^{I}_j\otimes P^{II}_j$ are projections on states of the form $u^{I}_j\otimes u^{II}_j$. Statistical operators of this kind are called by d'Espagnat mixtures of the first kind, while the other mixtures are called of the second kind. Such a W can be looked upon as a mixture of pure states $u^{I}_j\otimes u^{II}_j$, in each of which a property j of particle I is correlated to a property j of particle II. Therefore in such a case microsystem (I, II) could be considered as a system of two correlated particles. However, any state $P_\psi$ with

$$\psi = \alpha\,u_1\otimes v_1 + \beta\,u_2\otimes v_2$$

must be excluded; if I and II are two identical fermions, usual quantum mechanics claims that all pure states have this structure. It seems difficult to eliminate such states, which provide the energy levels for atoms in excellent agreement with experimental results. However, one could expect that scattering states, with well separated particles, should be described as two correlated particles; e.g. in the example we considered before, one could expect that the singlet state transforms into a mixture of the kind (Jauch, 1971)

$$W = \frac{1}{4\pi}\int P^{I}_{n}\otimes P^{II}_{-n}\,dn.$$

A behaviour of this type has been discussed by Bohm and Aharonov (1957). Then $\mathcal{V}'(t)$ must transform pure states into mixtures. Usually one assumes that $\mathcal{V}'(t)$ is a group; this excludes the aforementioned behaviour. However, in the framework of Ludwig's axiomatics, as is shown by Comi and coworkers (1975), the assumption that $\mathcal{V}'(t)$ is a group seems to be unnecessarily restrictive: $\mathcal{V}'(t)$ can be a semigroup of linear mappings of K into K. Then the required behaviour can arise at least asymptotically (Barchielli and Lanz, 1975).
A very important point is that measurable correlations are different for separable and for non-separable microsystems (I, II). Such differences are the same as those which discriminate between the existence or non-existence of local hidden variables. In fact a violation of Bell's inequality would prove the existence of non-separable microsystems (Selleri, 1971; Kasday, 1971).

Let me comment briefly on the relation of quantum mechanics to hidden variables theory. We started the discussion of experiments about microsystems by observing that for practical reasons only a statistical theory is needed, since single experiments are not reproducible. However, one cannot exclude that a more fundamental theory exists which could be applied to very hypothetical, perhaps non-realizable, preparations which are so accurate that all effects are reproducible, i.e. for each effect the theory tells us whether it occurs or not. Since the basic concept in macrophysics is the concept of state space, as will be shown in Section 3, it is appealing to associate with a microsystem a 'hidden' state space $Z^{M}$, which we assume to be a measure space, with a suitable set K of measures on a suitable $\sigma$-algebra of subsets of $Z^{M}$. Each preparation part prepares a microsystem to be represented by an element of $Z^{M}$. To a measuring part one associates a measurable function $\eta(z)$ which assumes the value unity if the measuring perturbation occurs or zero if it does not occur.
In a statistical experiment one can assume that a measure $\mu_W \in K$ corresponds to a preparation procedure and an 'average' function $\eta(z)$ corresponds to a measuring procedure, with $0 \le \eta(z) \le 1$ and $\eta(z)$ measurable; then the probability of an effect F after a preparation W would be

$$\int_{Z^{M}} \eta_F(z)\,d\mu_W(z).$$

The very hypothesis of the existence of a 'hidden' state space $Z^{M}$ for a microsystem does not contradict the basic axioms of quantum mechanics, if the latter is intended as the statistical theory of a certain class of interactions between two macrosystems; this is not so obvious if quantum mechanics is the theory of properties of a microsystem. A famous negative theorem about the existence of 'hidden' variables has been given by von Neumann (1955) and, in a more sophisticated way, by Jauch and Piron (1963). The physical relevance of these negative theorems has been criticized in an important paper by Bell (1966). While the existence or not of hidden variables has little to do with quantum mechanics, the properties of the function $\eta_F(z)$, which refers to a measuring procedure and not to a measuring part, must be confronted with quantum mechanics.

Bell considers the class of local hidden variables. Let me translate Bell's definition of local hidden variables into the language of the present discussion. Consider the effect $F = (F_1F_2) = (F_1 \text{ and } F_2)$, where $F_1$, $F_2$ refer to spatially well separated effects; then the locality condition is

$$\eta_{(F_1F_2)}(z) = \eta_{F_1}(z)\,\eta_{F_2}(z).$$

It is just such a requirement which makes hidden variables useful to describe a microsystem (I, II) and to explain the EPR paradox. Let us consider effects such that the pairs $F_1, F_2$; $F_1, F_2'$; $F_1', F_2$; $F_1', F_2'$ are coexistent and satisfy the locality condition. Typically $F_1, F_1'$ $(F_2, F_2')$ can correspond to two different orientations of the same apparatus, e.g. the symmetry axis of a photon linear polarization analyser can be oriented in different directions.
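The locality condition can be made concrete with a toy model. In the sketch below the hidden state z is an angle distributed uniformly on a circle, each apparatus has a 0/1 response function depending only on its own orientation and on z, and joint probabilities are formed with the product rule stated above. The model, the choice of four analyser angles and the use of the standard CHSH combination of correlation functions are all assumptions made for illustration; the quantum value used for comparison is the spin-1/2 singlet correlation for coplanar directions.

```python
import math

def eta_I(a, z):
    """Response function of apparatus I with orientation a: 1 if the effect occurs."""
    return 1.0 if math.cos(z - a) > 0 else 0.0

def eta_II(b, z):
    """Response function of apparatus II; anticorrelated with I for equal angles."""
    return 1.0 if math.cos(z - b) < 0 else 0.0

def local_joint_probability(f1, f2, n_grid=4000):
    """Integral of f1(z) * f2(z) dmu(z), with mu uniform on [0, 2 pi) (midpoint grid)."""
    zs = ((k + 0.5) * 2 * math.pi / n_grid for k in range(n_grid))
    return sum(f1(z) * f2(z) for z in zs) / n_grid

def local_correlation(a, b):
    """P(yes,yes) + P(no,no) - P(yes,no) - P(no,yes) in the local model."""
    yes_I = lambda z: eta_I(a, z)
    no_I = lambda z: 1.0 - eta_I(a, z)
    yes_II = lambda z: eta_II(b, z)
    no_II = lambda z: 1.0 - eta_II(b, z)
    return (local_joint_probability(yes_I, yes_II)
            + local_joint_probability(no_I, no_II)
            - local_joint_probability(yes_I, no_II)
            - local_joint_probability(no_I, yes_II))

def singlet_correlation(a, b):
    """Quantum correlation of the spin-1/2 singlet for coplanar directions a, b."""
    return -math.cos(a - b)

def chsh(correlation, a, a_prime, b, b_prime):
    """|E(a,b) - E(a,b')| + |E(a',b') + E(a',b)|."""
    return (abs(correlation(a, b) - correlation(a, b_prime))
            + abs(correlation(a_prime, b_prime) + correlation(a_prime, b)))

angles = (0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)  # a, a', b, b'
S_local = chsh(local_correlation, *angles)
S_quantum = chsh(singlet_correlation, *angles)
print(S_local, S_quantum)
```

Because each response function takes only the values 0 and 1 and depends on a single orientation, the local combination cannot exceed 2 for any measure on z, whereas the singlet correlations give 2√2 at these angles.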
Then for any preparation, the probabilities $P(F_1,F_2)$, $P(F_1,F_2')$, $P(F_1',F_2)$, $P(F_1',F_2')$ satisfy the inequality (Bell, 1971)

$$|P(F_1,F_2) - P(F_1,F_2')| + P(F_1',F_2') + P(F_1',F_2) \le 2.$$

In the mathematical theory of microsystems (I, II) provided by quantum mechanics there are effects and preparations (having the feature of 'non-separability') which do not satisfy Bell's inequality. Therefore local hidden variables do not complete quantum mechanics, but contradict certain of its statistical predictions. It is a very important experimental problem to gain evidence of a violation of Bell's inequality; this would rule out local hidden variables and indicate the reality of 'non-separable' microsystems. Recently interesting results on the two-photon system have been obtained by Kasday (1971) and by Clauser and Freedman (1972), which indicate a violation of Bell's inequality.

3. THEORY OF MACROSYSTEMS

A macrosystem is such that at any time one can say 'how it is'; physics is supposed to give a mathematical description of how 'a macrosystem' is. In place of the phrase 'how it is' let us speak of the 'state' of the macrosystem at time t and represent such a state by an element $z(t)$ of a suitable space Z. More precisely, let us consider a macrosystem which for times $t > 0$ is isolated; Ludwig postulates that its objective qualities at any time $t > 0$ are represented by a point $z(t)$ in a state space Z. Such a description of a macrosystem can be called realistic, objectivistic or, somewhat misleadingly, 'classical'. 'Classical' refers to the fact that it was the sole attitude of physicists before the development of quantum mechanics, but does not mean at all that one pictures a macrosystem as an assembly of molecules described by classical mechanics or that one derives all electromagnetic phenomena from the Maxwell equations.
To make clearer what is meant, consider a black body at equilibrium: its state can be specified by a description of its walls, the temperature T and the distribution $u(\nu)$ of electromagnetic energy density over the eigenfrequencies of the electromagnetic field; neglecting all aspects of the walls except the volume V of the hollow part, one has $z = (V, T, \{u(\nu)\})$, $z \in Z$; the average value of the variable $u(\nu)$ is given by the well-known Planck radiation law, in which the 'quantum theoretical' constant h appears. All statements about a macrosystem finally refer to a suitable space Z, which depends on the kind of system and on the level of the description. Examples of this are as follows: macrosystems schematized by a set of mass points, having at any time a position $x_j(t) \in \mathbb{R}^3$ and a momentum $p_j(t)\in\mathbb{R}^3$, $j = 1, 2 \ldots k$, which can be represented by an element of $\mathbb{R}^{6k}$; a fluid in local equilibrium inside a region $\Omega\subset\mathbb{R}^3$ is described in hydrodynamics by a mass density function $\rho(x)\in\mathcal{L}^1(\Omega)$, by an internal energy density $u(x)\in\mathcal{L}^1(\Omega)$ and by a velocity field $v(x)\in\mathcal{L}^\infty(\Omega)$; then the fluid can be represented by an element of $\mathcal{L}^1(\Omega)\times\mathcal{L}^1(\Omega)\times\mathcal{L}^\infty(\Omega)$. A dilute gas in $\Omega$ is almost completely described by the Boltzmann distribution function $f(x,p)\in\mathcal{L}^1(\Omega\times\mathbb{R}^3)$. Fortunately enough, in many cases fluctuations of the state are very small, so that statistics can be forgotten, the difficult problem of defining measures in a function space can be avoided and actual states can be identified with average states.

By many examples, e.g. the aforementioned ones, one is led to assume that Z is a complete metric space. The space $Y = C(\Theta_+, Z)$, $\Theta_+ = (0,+\infty)$, of all continuous functions $z(t)$, $t>0$, $z(t)\in Z$, is called trajectory space. On Y one can define in a well-known way a topology by which Y becomes a complete metric space. However, the corresponding metric $d(y, y')$ does not have a direct physical meaning.
If two trajectories y, y' are physically judged to be in a certain vicinity, this is not well represented by a condition such as $d(y, y') < \varepsilon$, the latter criterion being too restrictive. Ludwig shows that a new metric can be defined, leading to physically meaningful vicinities, which induces in Y the same topology (but a coarser uniform structure) as the topology $C_c(\Theta_+, Z)$ of uniform convergence on compact subsets of $\Theta_+$; Y with such a new metric is not complete. Its completion $\hat Y$ is a compact Hausdorff space; in $\hat Y$ the set of all continuous functions $z(t)$, $t > 0$, is dense. On $\hat Y$ a continuous time translation operator $T(\tau)$ can be defined for $\tau > 0$: $T(\tau)y = y'$, where $y = z(t) \Rightarrow y' = z(t+\tau)$, $\forall y \in Y$.

Let us consider the Borel $\sigma$-algebra $\mathcal{B}(\hat Y)$ and the set of all signed Radon measures on $\mathcal{B}(\hat Y)$. To this set a Banach space structure can be given; it coincides with $C'(\hat Y)$, which is the dual space of the Banach space $C(\hat Y)$ of continuous functions on $\hat Y$. A preparation procedure of a statistical collection of macrosystems is represented by a positive, normalized Radon measure $u(\omega)$ on $\mathcal{B}(\hat Y)$. Once such a measure is explicitly given, the whole statistical dynamics of a macrosystem of the prepared collection is known. In fact $u(\omega)$ is the probability that the trajectory of the macrosystem belongs to $\omega \subset \hat Y$. The whole physics of a macrosystem of a given type (i.e. describable in a given space Z) is known if the convex set of all possible preparations $K_m \subset C'(\hat Y)$ is known.

Suppose $K_m$ is given, and let us see how the physics of the macrosystem can be obtained. Consider the weak closure $\bar K_m$ of $K_m$ in $C'(\hat Y)$; $\bar K_m$ is convex and compact in the $\sigma(C'(\hat Y), C(\hat Y))$ topology; the extreme points $u_i$ of $\bar K_m$ are 'elementary preparations', each preparation being a mixture of them. Then consider $u_i \in \bar K_m$.
The support of $u_i(\omega)$ is the set of functions $z(t)$, and also of limit points of Y in $\hat Y$, which are possible trajectories for the macrosystem; if the support of $u_i$ reduces to a point $y_i \in Y$ one has a deterministic dynamics, $y_i$ being the trajectory of the macrosystem. If one assumes that the cylindrical sets

$$\omega_{\eta,t} = \{y : z(t)\in\eta\},\qquad \eta\in\mathcal{B}(Z),\ t > 0,$$

are $u_i$-measurable, $u_i(\omega_{\eta,t})$ is the probability that the state of a macrosystem of the prepared collection belongs at time t to $\eta \subset Z$. Obviously in the deterministic case this probability is 1 if the trajectory $y_i$ is in $\eta$ at time t and 0 otherwise. For any $t > 0$, $u \in K_m$, $u(\omega_{\eta,t})$ is a positive, normalized measure on $\mathcal{B}(Z)$. If for any $u_1, u_2 \in K_m$ the equality

$$u_1(\omega_{\eta,t}) = u_2(\omega_{\eta,t}),\qquad \forall \eta\in\mathcal{B}(Z),\ t < \varepsilon,\ \varepsilon \text{ arbitrary} > 0,$$

implies $u_1 = u_2$, the theory is Markoffian. More general cylindrical sets can be considered,

$$\omega_{\eta_1 t_1,\ldots,\eta_k t_k} = \{y : z(t_i)\in\eta_i,\ i = 1, 2 \ldots k\},\qquad \eta_i\in\mathcal{B}(Z),\ t_i > 0;$$

then $u(\omega_{\eta_1 t_1,\ldots,\eta_k t_k})$ provides a full description of the time correlations.

Symmetry under time translations implies that if $u \in K_m$ also $u_t \in K_m$, where $u_t(\omega) = u(T_t^{-1}\omega)$, $T_t = T(t)$, $t > 0$; the mapping $u \to u_t$ defines a semigroup $\mathcal{V}'(t)$ of endomorphisms of $C'(\hat Y)$, $u_t = \mathcal{V}'(t)u$, which maps $K_m$ into $K_m$. The great advantage of this formulation is as follows: no assumption about the dynamics of a macrosystem enters into the mathematical structure of the theory, which, however, is precise enough to solve formally the problem of measurement in quantum mechanics.

The unusual concept of trajectory space can be avoided at the price of the following, perhaps wrong, assumption about the dynamics of a macrosystem: by a suitable choice of Z the dynamics of a macrosystem is Markoffian. Classical mechanics in phase space, hydrodynamics and the Boltzmann description of a gas are examples of Markoffian theories. In this case the preparations of a collection of macrosystems are represented by a set $K_Z$ of positive normalized measures on $\mathcal{B}(\hat Z)$, $\hat Z$ being a suitable compactification of Z, and a semigroup $\mathcal{V}'_Z(t)$ of endomorphisms of $C'(\hat Z)$ exists which maps $K_Z$ into $K_Z$.
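A finite toy model may help to fix ideas about the Markoffian case. In the sketch below the state space has three points, time is discrete, and a preparation is a probability vector on the states; all of this, including the transition matrix and the function names, is assumed purely for illustration. The code shows the one-sided semigroup acting on measures and how the probability of a cylindrical set is obtained by alternately evolving the measure and restricting it to the prescribed subset.

```python
def apply_step(transition, u):
    """One time step on measures: new u(j) = sum_i u(i) * transition[i][j]."""
    n = len(u)
    return [sum(u[i] * transition[i][j] for i in range(n)) for j in range(n)]

def evolve(transition, u, t):
    """Semigroup on measures for integer t >= 0 (t repeated steps)."""
    for _ in range(t):
        u = apply_step(transition, u)
    return u

def restrict(u, eta):
    """Keep only the mass sitting in the subset eta of the state space."""
    return [mass if j in eta else 0.0 for j, mass in enumerate(u)]

def cylinder_probability(transition, u, constraints):
    """Probability that z(t_i) lies in eta_i for every (t_i, eta_i), t_1 < t_2 < ..."""
    previous_time = 0
    for t, eta in constraints:
        u = restrict(evolve(transition, u, t - previous_time), eta)
        previous_time = t
    return sum(u)

# Hypothetical irreversible relaxation 0 -> 1 -> 2 (state 2 is absorbing).
T = [[0.9, 0.1, 0.0],
     [0.0, 0.8, 0.2],
     [0.0, 0.0, 1.0]]
u0 = [1.0, 0.0, 0.0]          # preparation concentrated on state 0

print(evolve(T, u0, 5))
print(cylinder_probability(T, u0, [(1, {0}), (2, {1})]))
```

The chain is deliberately irreversible: the evolution is defined only for t ≥ 0, which is the feature that distinguishes a semigroup from a group in this setting.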
If $u \in K_Z$ is a statistical preparation of a macrosystem, $(\mathcal{V}'_Z(t)u)(\eta)$, $\eta\in\mathcal{B}(\hat Z)$, is the probability that the state z of the macrosystem belongs at time t to a set $\eta\subset\hat Z$. For $\sigma\in\mathcal{B}(\hat Z)$ let us define the linear operator $\chi_\sigma$ on $C'(\hat Z)$ as

$$(\chi_\sigma u)(\eta) = u(\sigma\cap\eta);$$

then, for $t_1 < t_2 < \ldots < t_k$,

$$\big(\chi_{\eta_k}\mathcal{V}'_Z(t_k - t_{k-1})\cdots\chi_{\eta_2}\mathcal{V}'_Z(t_2-t_1)\,\chi_{\eta_1}\mathcal{V}'_Z(t_1)\,u\big)(\hat Z)$$

is the probability measure of the cylindrical set of Y: $\{y : z(t_i)\in\eta_i,\ i = 1, 2, \ldots k,\ \eta_i\in\mathcal{B}(\hat Z)\}$.

Let us consider a preparation procedure $W^{I}\in K(h^{I})$ of a collection I of microsystems and a preparation $u^{II}\in K_m\subset C'(\hat Y)$ of a collection II of macrosystems, and let us correlate effects of the microsystem with observations of the trajectories of the macrosystem. Then the preparation of the composite system can be described by a positive normalized measure defined on $\hat Y$ with values in $\tau C(h^{I})$:

$$u^{I,II}(\omega) = W^{I}\,u^{II}(\omega),\qquad \omega\in\mathcal{B}(\hat Y);$$

the probability that the microsystem produces the effect $F^{I}$ and the trajectory of the macrosystem belongs to a set $\omega\subset\hat Y$ is given by

$$\mathrm{Tr}(W^{I}F^{I})\,u^{II}(\omega) = \mathrm{Tr}\big(u^{I,II}(\omega)F^{I}\big).$$

Therefore one is led to describe in general the preparation of the system microsystem I + macrosystem II by a suitable set $K^{I,II}$ of positive normalized measures $u^{I,II}(\omega)$ on $\mathcal{B}(\hat Y)$, with values in $\tau C(h^{I})$, where normalization means that $\mathrm{Tr}_{h^{I}}(u^{I,II}(\hat Y)) = 1$. $\mathrm{Tr}(F^{I}u^{I,II}(\omega))$ is the probability that effect $F^{I}$ occurs and the trajectories of the macrosystem belong to the set $\omega\in\mathcal{B}(\hat Y)$. By symmetry under time translations a semigroup of affine mappings $\mathcal{V}'(t)$ of $K^{I,II}$ into $K^{I,II}$ must exist, representing a preparation consisting in preparing u and waiting a time t, i.e. the free evolution during a time t after the preparation u.

A measurement procedure on a microsystem with statistical operator $W^{I}$ can be described in the following way. One has a system composed of an affected part, prepared with the preparation procedure $u^{II}$, and of a microsystem prepared with the preparation procedure $W^{I}$.
After a time T, chosen in such a way that the micro- and the macrosystem have interacted, one looks at the trajectories of the macrosystem with no regard to the microsystem. The probability that the trajectories of the macrosystem belong to a set $\omega\in\mathcal{B}(\hat Y)$ is given by

$$p(\omega) = \mathrm{Tr}\big((\mathcal{V}'(T)\,\mathcal{J}\,W^{I}\!\cdot u^{II})(\omega)\,I^{I}\big) = \mathrm{Tr}\big((\mathcal{V}'(T)\,\mathcal{J}\,W^{I}\!\cdot u^{II})(\omega)\big),$$

where $(W^{I}\cdot u^{II})(\omega) = W^{I}u^{II}(\omega)$, $\mathcal{J}$ being a suitable affine mapping of $K(h^{I})\times K_m$ into $K^{I,II}$. Since $p(\omega)$ is an affine functional of $W^{I}$ on $K(h^{I})$ and $0\le p(\omega)\le 1$, there exists a uniquely identifiable effect $F^{I}(\omega)\in L(h^{I})$ such that

$$p(\omega) = \mathrm{Tr}\big(F^{I}(\omega)\,W^{I}\big).$$

The set of effects $F^{I}(\omega)$, $\omega\in\mathcal{B}(\hat Y)$, is an effect-valued measure on the $\sigma$-algebra of Borel subsets of $\hat Y$; it defines an observable of the microsystem. This notion of an observable is a straightforward generalization of the usual representation of a set of compatible observables by a set of commuting self-adjoint operators $A_1, A_2 \ldots A_k$; in fact one has the correspondence

$$\{A_i,\ i = 1, 2 \ldots k\} \leftrightarrow P(E),\qquad E\in\mathcal{B}(\mathbb{R}^k),$$

$P(E)$ being the common spectral measure of $A_1, A_2 \ldots A_k$, such that

$$A_i = \int \lambda_i\,dP(\lambda);$$

$P(E)$ is a projection-valued measure on the $\sigma$-algebra of Borel subsets of $\mathbb{R}^k$. The generalization consists in replacing the projection-valued measure with an effect-valued one and $\mathbb{R}^k$ by $\hat Y$.

Let the affected part have a pointer whose position is $x\in\mathbb{R}$; then one can write, with obvious notations,

$$z = (x, z'),\qquad \omega(t_0, E) = \{y : x(t_0)\in E\},\qquad E\in\mathcal{B}(\mathbb{R}),$$

and consider the effects $F_{t_0}(E) = F(\omega(t_0, E))$. For fixed $t_0$ and $E\in\mathcal{B}(\mathbb{R})$ one has an effect-valued measure on $\mathbb{R}$; if in particular the effects are idempotent, the operator

$$\int_{-\infty}^{+\infty}\lambda\,dF_{t_0}(\lambda)$$

would be an ordinary observable. Therefore we see that an affected part, prepared by a preparation procedure $u^{II}$ and left to interact with a microsystem during a fixed time interval T, identifies an observable of the microsystem, which could be explicitly calculated if $u^{II}$ and $\mathcal{V}'(T)$ were explicitly known.
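The difference between a projection-valued and an effect-valued measure can be seen in the smallest possible case. The sketch below assumes a two-dimensional Hilbert space and a pointer that registers only two regions, labelled +1 and -1; the 'sharpness' parameter and the function names are hypothetical devices for the example. With sharpness 1 the two effects are projections; with a smaller value they are still positive and sum to the identity, but they are no longer idempotent.

```python
import numpy as np

SIGMA_Z = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def pointer_effects(sharpness):
    """Effects F(+1), F(-1) attached to the two pointer regions.
    sharpness = 1 gives projections; smaller values give unsharp effects."""
    return {+1: (I2 + sharpness * SIGMA_Z) / 2,
            -1: (I2 - sharpness * SIGMA_Z) / 2}

def is_effect_valued_measure(effects, tol=1e-12):
    """Check 0 <= F <= I for each effect and that the effects sum to I."""
    total = sum(effects.values())
    positive = all(np.linalg.eigvalsh(f).min() >= -tol for f in effects.values())
    bounded = all(np.linalg.eigvalsh(f).max() <= 1 + tol for f in effects.values())
    return positive and bounded and np.allclose(total, I2)

def is_projection_valued(effects):
    """True when every effect is idempotent (F F = F)."""
    return all(np.allclose(f @ f, f) for f in effects.values())

def probabilities(effects, w):
    """Tr(F w) for every pointer region."""
    return {outcome: np.trace(f @ w).real for outcome, f in effects.items()}

W = np.array([[0.75, 0.0], [0.0, 0.25]])   # an assumed statistical operator
sharp = pointer_effects(1.0)
unsharp = pointer_effects(0.6)

print(is_projection_valued(sharp), is_projection_valued(unsharp))
print(probabilities(sharp, W), probabilities(unsharp, W))
```

Both families give probabilities that add up to one for every W, so both define observables in the generalized sense; only the sharp family corresponds to a self-adjoint operator through its spectral measure.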
Observables corresponding to different $u^{II}$ are in general not compatible. If one assumes that, apart from superselection rules, all elements of $L(h^{I})$ are effects, one has a deep, yet unexplored, link between the structure of Hilbert spaces and superselection rules for microsystems, the structure of spaces Z for macrosystems and the interactions between micro- and macrosystems. In this treatment it has been assumed for simplicity that the microsystem is not absorbed by the macrosystem, i.e. the possible transition of system (I, II) to system II is not taken into account.

Let us consider the microsystem after the interaction. The fact that after a suitable time $\bar t$ the interaction is negligible can be formalized by

$$\mathcal{V}'(t)\,\mathcal{J}\,W^{I}\!\cdot u^{II} = \mathcal{V}^{\prime 0}(t)\,\varphi\,\mathcal{J}\,W^{I}\!\cdot u^{II},\qquad t > \bar t,$$

where $\mathcal{V}^{\prime 0}(t)$ is the no-interaction time evolution mapping,

$$(\mathcal{V}^{\prime 0}(t)u)(\omega) = e^{-iH^{I}t}\,u(T_t^{-1}\omega)\,e^{iH^{I}t},$$

and $\varphi$ is an affine mapping of $K^{I,II}$ into $K^{I,II}$ which describes the 'collision' of the microsystem with the macrosystem. The probability of an effect $F^{I}$, with no regard to the macrosystem, after the preparation $W^{I}\cdot u^{II}$ and its evolution during a time t, is given by

$$\mathrm{Tr}\big(e^{-iH^{I}t}(\varphi\,\mathcal{J}\,W^{I}\!\cdot u^{II})(\hat Y)\,e^{iH^{I}t}F^{I}\big) = \mathrm{Tr}\big(W^{I}(t)F^{I}\big),$$

with

$$W^{I}(t) = e^{-iH^{I}t}(\varphi\,\mathcal{J}\,W^{I}\!\cdot u^{II})(\hat Y)\,e^{iH^{I}t}.$$

Consider any covering of $\hat Y$ by a countable set of disjoint Borel sets $\omega_i$ of $\hat Y$; correspondingly one has the following decomposition:

$$t > \bar t,\qquad W^{I}(t) = \sum_i p_i(t)\,W^{I}_i(t),$$

where the $p_i(t)$ are weights and the $W^{I}_i(t)$ are statistical operators; each component of such a decomposition is correlated to a certain set $T_t^{-1}\omega_i$ of trajectories of the macrosystem. In such a way the measuring process is explained and an explicit link between $F^{I}$ and the description of the affected parts has been obtained. It is also possible to link $W^{I}$ with the preparation parts. Finally the probability of $F^{I}$ with a preparation $W^{I}$ can be formally expressed as the probability that the trajectories of the composed system, preparation part + affected part, belong to a certain set of the trajectory space of such a composed system (Ludwig, 1972b).
Notice that, in attitude (2) of Section 1, the question about the statistical operator of a microsystem after a measurement (by which it is not absorbed) cannot be solved within the axiomatics of quantum mechanics: one only knows that a statistical operator exists, which represents a preparation including the interaction with the apparatus. Concepts such as 'measurements of the first type of a complete set of observables' are artificial ingredients by which simple exercises for students in quantum mechanics can be given.

Let me remark that I have made statements such as 'the probability that the trajectories of the macrosystem belong to a certain set $\omega\subset\hat Y$', skipping for simplicity the problem of how such an objective fact can be ascertained. Such a point is treated in the theory of Ludwig, who formalizes the concept of 'registration' of trajectories. A concrete registration procedure, e.g. a registration by our senses, which registers certain trajectories and rejects the others, is always affected by an uncertainty in the registration of the trajectories which are at the boundary between the accepted and the rejected ones. To a registration procedure of trajectories there corresponds a continuous function $f(y)$ on $\hat Y$ such that $0\le f(y)\le 1$; $f(y)$ is called a 'trajectory effect'. The set [0, 1] of $C(\hat Y)$, i.e. the set of continuous functions with $0\le f\le 1$, represents the set of all registration procedures. The probability of registration f for a macrosystem with preparation u is given by $\int_{\hat Y} f(y)\,du(y)$, i.e. by the value at f of the functional which represents u in $C'(\hat Y)$. Idealized registration procedures which accept or reject trajectories without uncertainty are represented by characteristic functions $\chi_\omega(y)$ of the Borel subsets $\omega$ of $\hat Y$; in such cases one has

$$\int_{\hat Y}\chi_\omega(y)\,du(y) = u(\omega),$$

which is the result we have used. Analogous considerations hold for the composite system macrosystem + microsystem.
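The distinction between a realistic registration and an idealized one can be sketched for a measure concentrated on a few trajectories. Everything in the example below is assumed for illustration: trajectories are pointer positions sampled at three times, the preparation is a list of (trajectory, weight) pairs, and the smooth trajectory effect is a linear ramp across the acceptance boundary.

```python
def registration_probability(trajectory_effect, weighted_trajectories):
    """Integral of f(y) du(y) for a measure u carried by finitely many trajectories."""
    return sum(weight * trajectory_effect(y) for y, weight in weighted_trajectories)

def sharp_effect(threshold, t_index):
    """Characteristic function of the set {y : x(t) > threshold} (idealized registration)."""
    return lambda y: 1.0 if y[t_index] > threshold else 0.0

def smooth_effect(threshold, t_index, width):
    """Continuous trajectory effect: rises linearly from 0 to 1 across the boundary."""
    def f(y):
        s = (y[t_index] - threshold) / width + 0.5
        return min(1.0, max(0.0, s))
    return f

# Hypothetical prepared collection: pointer positions at three sampling times.
u = [((0.0, 0.1, 0.2), 0.5),
     ((0.0, 0.9, 1.8), 0.3),
     ((0.0, 0.5, 1.0), 0.2)]

p_sharp = registration_probability(sharp_effect(1.0, 2), u)
p_smooth = registration_probability(smooth_effect(1.0, 2, 0.4), u)
print(p_sharp, p_smooth)
```

The third trajectory sits exactly on the boundary: the idealized registration rejects it outright, while the continuous trajectory effect accepts it with weight one half, which is the uncertainty at the boundary described in the text.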
Trajectory effects have an important formal role: the whole theory can be put in a mathematical form which exhibits the same linear and order structures as quantum mechanics. Such formal resemblance could be relevant for the problem of connecting the axiomatic theory of macrosystems with N-body quantum mechanics.

4. RELATION TO N-BODY THEORY

The main tool for the description of macrosystems is N-body quantum theory, which in many applications can be replaced by N-body classical statistical mechanics. The practical success of this theory is very great, and the difficulties it meets can be attributed to excessive technical complexity. The general pattern by which such success is achieved can be described as follows. Let h be the Hilbert space of the N-body structure, H its Hamiltonian, L(h) the set of effects and K(h) the set of statistical operators. In correspondence to a space Z, typically a space of n-tuples of functions $\varphi_j(\xi)$, $\xi \in R^k$, j = 1, 2 ... n, one guesses a set of fields $\hat\varphi_j(\xi)$, $\xi \in R^k$, j = 1, 2 ... n, of self-adjoint operators in h and a set K of statistical operators. For $W \in K$,

$\mathrm{Tr}\,(e^{iHt}\,\hat\varphi_j(\xi)\,e^{-iHt}\,W)$

is interpreted as the average value of $\varphi_j(\xi)$ at time t for the statistical collection described by W at t = 0. Sometimes also expressions such as $\mathrm{Tr}\,(e^{iHt}\,(\hat\varphi_j(\xi))^2\,e^{-iHt}\,W)$ are calculated and interpreted as dispersions. An example of this procedure is as follows: Z = space of Boltzmann functions (distribution functions) for a gas,

$\hat f(x, p) = \int e^{ip \cdot u}\,\psi^{+}(x + \tfrac{u}{2})\,\psi(x - \tfrac{u}{2})\,du$ (up to normalization)

$\psi(x)$ being the field operator in the second quantization formalism.

In general such a procedure meets many purposes in macrophysics but does not yield the statistical distribution on the trajectory space that underlies such average values and dispersions. Using such a procedure one does not have a sufficient input for Ludwig's theory of macrosystems.
The simplest way to provide such an input would be to find a measure $F(\omega)$ on $\mathfrak{B}(Y)$ having values in the set L(h) of effects of N-body theory, such that

$e^{iHt}\,F(\omega)\,e^{-iHt} = F(T_t^{-1}\omega), \quad t \ge 0$

the last requirement arising since $e^{iHt} \ldots e^{-iHt}$ is the time translation mapping on L(h) and $\omega$ is a subset of a trajectory space. Then $u_W(\omega) = \mathrm{Tr}\,(F(\omega)W)$, for all $W \in K(h)$, would be an element of $K_m$. The family $F(\omega)$ is an observable by Ludwig's more general definition. It is the macroobservable of the N-body system, which corresponds to the family of idealized effects $\chi_\omega(y)$ in the theory of a macrosystem.

One does not expect that such a strong solution of the problem exists. In fact, due to the macroscopic irreversibility of $u_W \in K_m$ it follows that $u_{W_T}$, where $W_T$ is W transformed by a time inversion, should no longer be sensible. Therefore one must give a 'weak' form to the previous requirement. A possible form could be:

($\alpha_1$). On $\mathfrak{B}(Y)$ a measure $F(\omega)$ with values in L(h) must exist and a set $K \subset K(h)$ can be found such that for all $W \in K$

$\mathrm{Tr}\,(e^{iHt}\,F(\omega)\,e^{-iHt}\,W) = \mathrm{Tr}\,(F(T_t^{-1}\omega)\,W)$, that is, $u_{e^{-iHt}We^{iHt}}(\omega) = u_W(T_t^{-1}\omega), \quad t > 0$

The requirement that $F(\omega)$ is a measure on $\mathfrak{B}(Y)$ is mathematically very restrictive; physically it means that the apparatuses $A_\omega$ which, by a measurement on the N-body structure, register the sets $\omega$ of trajectories, can all coexist; since the apparatuses which measure a structure of ~$10^{23}$ particles are very hypothetical objects, the physical meaning of the previous statement is questionable.
Therefore one is led to give a 'weak' formulation also for the measure character of $F(\omega)$ and in place of ($\alpha_1$) one requires:

($\alpha_2$). On $\mathfrak{B}(Y)$ a function $F(\omega)$ with values in L(h) must exist and a set $K \subset K(h)$ can be found such that, for all $W \in K$, $u_W(\omega) = \mathrm{Tr}\,(F(\omega)W)$ is a measure on $\mathfrak{B}(Y)$ and

$u_{e^{-iHt}We^{iHt}}(\omega) = u_W(T_t^{-1}\omega), \quad t > 0$

Since in our attitude N-body theory is only a provisional tool to find the right theory, a less strong requirement is meaningful, such as the following one:

($\alpha_3$). On $\mathfrak{B}(Y)$ a family of functions $F_\delta(\omega)$, $\delta > 0$, with values in L(h) exists, and a family $K_\delta \subset K(h)$, such that for all $W_\delta \in K_\delta$

$\lim_{\delta \to 0} \mathrm{Tr}\,(F_\delta(\omega)\,W_\delta)$

exists for all $\omega \in \mathfrak{B}(Y)$ and defines a measure $u_{\{W_\delta\}}(\omega)$ on $\mathfrak{B}(Y)$ (no existence of $\lim_{\delta \to 0} F_\delta(\omega)$ or of $\lim_{\delta \to 0} W_\delta$ is required); further

$u_{\{e^{-iHt}W_\delta e^{iHt}\}}(\omega) = u_{\{W_\delta\}}(T_t^{-1}\omega), \quad t > 0$

Analogous considerations can be made if one assumes that the macrodynamics are Markoffian; then essentially one has Z in place of Y and this is a simplification, but the second part of ($\alpha$) becomes the following: a semigroup $U'(t)$ exists on $K_m$ such that

$U'(t)\,u_W(\eta) = u_{e^{-iHt}We^{iHt}}(\eta), \quad t > 0, \quad \eta \in \mathfrak{B}(Z)$

and this is a complication. If assumption ($\alpha_1$) is further restricted by the requirement that $F(\omega)$ is idempotent, it leads to the well-known problem of the macroobservables as self-adjoint commuting operators on h. Such a problem has no solution of appreciable generality and indeed it seems to be too naive a formulation of the problem of macroscopicity. In any case, since there is no logically compelling reason to assume that the dynamics in state space is Markoffian, the usual master equation approach seems not to be completely appropriate for solving the problem.

The main step in obtaining $F(\omega)$, $\omega \in \mathfrak{B}(Y)$, is to build $F(\omega)$ for $\omega$ being cylindrical sets $(\eta_1 t_1, \eta_2 t_2, \ldots, \eta_k t_k) \subset Y$, $t_i > 0$, $\eta_i \in \mathfrak{B}(Z)$, which is already sufficient for most physical applications.
It is also the sufficient input for the rather technical problem of proving the existence of the function $F(\omega)$, $\omega \in \mathfrak{B}(Y)$, which assumes assigned values on the cylindrical sets. A final assumption on $F(\omega)$ and on K, which possibly has some implication for the allowed interactions between a microsystem and a macrosystem, is as follows:

($\beta$). For each microsystem S with Hilbert space $h_S$ the following family of elements of $K(h_S)$

$u_{W_S W}(\omega) = \mathrm{Tr}_h\,(I_S \otimes F(\omega)\,e^{-iHt}\,W_S \otimes W\,e^{iHt}), \quad t > 0$

for all $W_S \in K(h_S)$ and for all $W \in K$, is still a measure on $\mathfrak{B}(Y)$; H is the Hamiltonian of the system: N-body structure + microsystem S; more generally, according to ($\alpha_3$), one could substitute the right-hand side of this equation by

($\beta_3$) $\lim_{\delta \to 0} \mathrm{Tr}_h\,(I_S \otimes F_\delta(\omega)\,e^{-iHt}\,W_S \otimes W_\delta\,e^{iHt})$

If $F(\omega)$ and K were known, $u_{W_S W}(\omega)$ would be the required input for Ludwig's formal scheme and the problems of measurement would be solved. In conclusion the difficulty with measurement in quantum mechanics has been shifted from formula (2) to the problem of building $F(\omega)$ and identifying K such that ($\alpha_3$) and ($\beta_3$) hold. The limit $\delta \to 0$ in ($\alpha_3$), ($\beta_3$) should introduce macroscopicity as a limit situation of N-body theory. One can hope that this is only a technical difficulty.

Acknowledgement

The treatment in this section is based on a paper I am preparing with Dr. G. C. Lupieri. I wish to thank Dr. Lupieri for useful discussions about this subject.

REFERENCES

Barchielli, A. and Lanz, L. (1975) 'Non Hamiltonian description of two particle systems,' I.F.U.M., 181, F. J. Milano.
Bell, J. S. (1966) Rev. Mod. Phys., 38, 447.
Bell, J. S. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Bohm, D. (1951) Quantum Theory, Prentice Hall, New Jersey.
Bohm, D. and Aharonov, Y. (1957) Phys. Rev., 108, 1070.
Caldirola, P.
(1974) Dalla Microfisica alla Macrofisica, Mondadori, Milano.
Comi, M., Lanz, L., Lugiato, L. A. and Ramella, G. (1975) J. Math. Phys., 16, 910.
Daneri, A., Loinger, A. and Prosperi, G. M. (1962) Nuclear Phys., 33, 297.
Einstein, A., Podolsky, B. and Rosen, N. (1935) Phys. Rev., 47, 777.
d'Espagnat, B. (1971a) Conceptual Foundations of Quantum Mechanics, Benjamin, New York.
d'Espagnat, B. (1971b) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Everett III, H. (1957) Rev. Mod. Phys., 29, 454.
Freedman, S. J. and Clauser, J. F. (1972) Phys. Rev. Letters, 28, 938.
George, G., Prigogine, I. and Rosenfeld, L. (1972) Dansk. Mat. Fys. Medd., 38.
Hepp, K. (1972) Helv. Phys. Acta, 45, 234.
Jauch, J. M. (1968) Foundations of Quantum Physics, Addison Wesley, Reading, Mass.
Jauch, J. M. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Jauch, J. M. and Piron, C. (1963) Helv. Phys. Acta, 36, 827.
Jauch, J. M. and Piron, C. (1969) Helv. Phys. Acta, 42, 842.
Kasday, L. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Lanz, L., Prosperi, G. M. and Sabbadini, A. (1971) Nuovo Cimento, 2 B, 184.
Ludwig, G. (1953) Z. Phys., 135, 483.
Ludwig, G. (1970) Lecture Notes in Physics, 4, Springer, Berlin.
Ludwig, G. (1973a) Lecture Notes in Physics, 29, Springer, Berlin and 'Makroskopische Systeme und Quantenmechanik', Notes Math. Phys. Marburg (1972).
Ludwig, G. (1973b) Lecture Notes in Physics, 29, Springer, Berlin and 'Mess- und Präparierprozesse', Notes Math. Phys. Marburg (1972).
von Neumann, J. (1955) Mathematical Foundations of Quantum Mechanics, Princeton University Press, Princeton.
Piron, C. (1964) Helv. Phys. Acta, 37, 439.
Prosperi, G. M.
(1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Selleri, F. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Wigner, E. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.
Zeh, H. D. (1971) Foundations of Quantum Mechanics, Proceedings of the IL Enrico Fermi International Summer School, B. d'Espagnat, Ed., Academic Press, New York.

The Correspondence Principle and Measurability of Physical Quantities in Quantum Mechanics

YURI A. RYLOV
Institute of Space Research, U.S.S.R. Academy of Sciences, Moscow

1. INTRODUCTION

Fifty years have passed since the discovery of the uncertainty principle by Heisenberg. Quantum theory has scored great successes in many areas of physics, especially in atomic and nuclear physics and in solid-state theory. All physicists agree upon the formalism of quantum mechanics, but there is no agreement on questions of the interpretation of quantum mechanics and of measurement theory. Most physicists merely ignore these problems, correctly believing that they are of negligible importance in calculations for particular quantum systems.

There are many shades of interpretation of quantum mechanics and of the problems of measurement even in orthodox quantum theory. Still other viewpoints are found in numerous hidden-variable theories. Bibliographies on the problems of measurement and interpretation in quantum mechanics can be found in the surveys by Margenau (1963), Pearle (1967), Ballentine (1970) and Reece (1973).

In this paper I am going to consider only the question of the constraints imposed upon the measurability of physical quantities by the time-energy uncertainty relation.
I shall not consider other questions connected with the interpretation of quantum mechanics and measurement theory and shall only make a few remarks about them.

I shall confine myself to the statistical interpretation of quantum mechanics.* According to this interpretation a wave function provides a description of the statistical properties of an ensemble of similarly prepared systems. Ballentine (1970) ascribed this interpretation to Einstein (1949), Popper (1959) and Blokhintsev (1968). In my opinion, the same interpretation was upheld by Mandelstam (1950) in his brilliant lectures on the theory of indirect measurements, which were given at Moscow State University in 1939 and were printed only in 1950. The statistical interpretation differs from the Copenhagen one (Heisenberg, 1955), which asserts that the wave function provides a complete and exhaustive description of an individual system. Ballentine (1970) has shown that the stronger constraint, which is used in the Copenhagen interpretation, is of no importance in applications of quantum mechanics, but in some cases leads to paradoxes.

*This terminology is used by Ballentine (1970).

All sensible statements of quantum mechanics, being statistical ones, always concern not a single dynamical system but an ensemble of systems. For instance, the statement that a measurement of the spin component $\sigma_x$ in the state with $\sigma_z = 1/2$ gives the value -1/2 with probability 0.5 is pointless if it concerns only one single system. In fact, this statement means that, measuring the component $\sigma_x$ many times in the similarly prepared state, one gets the value $\sigma_x = -1/2$ in half of the cases. But many measurements cannot be performed on one system, because its state is perturbed after the measurement. For this reason it is necessary to have an ensemble of systems and to perform measurements upon many of them.
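The ensemble reading of the statement 'the value -1/2 occurs with probability 0.5' can be made concrete with a small simulation. This is only a sketch: the probability 0.5 quoted in the text is taken as given, each single measurement is modelled by one pseudo-random draw, and the names `single_measurement` and `relative_frequency`, the seed and the ensemble sizes are illustrative assumptions.

```python
import random


def single_measurement(rng, probability_minus=0.5):
    """One single measurement on a freshly prepared system.

    Returns -0.5 with the probability quoted in the text, else +0.5.
    """
    return -0.5 if rng.random() < probability_minus else 0.5


def relative_frequency(n_systems, seed=0):
    """Fraction of an ensemble of n_systems that yields the value -1/2."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_systems) if single_measurement(rng) == -0.5)
    return hits / n_systems


# One system can only give frequency 0 or 1; a large ensemble approaches 0.5.
freq_small = relative_frequency(1)
freq_large = relative_frequency(100_000)

print(freq_small, freq_large)
```

For a single system the relative frequency is necessarily 0 or 1, so the number 0.5 acquires a meaning only for a sufficiently large ensemble.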
There is another possibility: to prepare the same initial state of the same individual system and to measure repeatedly. But this is also an ensemble. The proposition that there are sensible statistical statements about individual systems in quantum mechanics is an illusion. To make a statistical statement sensible, it is necessary to deal with an ensemble of systems. Hence, there is no reason to insist that a wave function describes a state of an element of an ensemble (a single system), but not the ensemble as a whole. However, if the Copenhagen terminology is understood not word for word but as a peculiar physical slang, used for brevity, then there is no objection to it.

Following von Neumann (1932), let us now give the main statements of quantum mechanics.

(A1). Any state of an ensemble consisting of many identical systems is represented by a definite self-adjoint operator U (called a statistical operator or state operator), which is defined in a Hilbert space $\mathcal{H}$. The operator U has real non-negative eigenvalues and obeys the condition

$\mathrm{Sp}\,U = 1$   (1)

(A2). An observable physical quantity R is represented by a self-adjoint operator R in the Hilbert space $\mathcal{H}$. A physical quantity F(R) is represented by the operator F(R). In particular, the canonical variables $q^i$, $p_i$ (i = 1, 2, ... n) of a classical system correspond to operators $q^i$, $p_i$ (i = 1, 2, ... n) in the space $\mathcal{H}$, which obey the commutation relations

$[q^i, p_k]_- = i\hbar\,\delta^i_k\,I$   (2)

where I is the unit operator on $\mathcal{H}$ and $\hbar$ is Planck's constant. A function F(q, p) of the canonical variables q, p is represented by the self-adjoint operator F(q, p). The order of the operators is chosen in some way which is not fixed.

(A3). The mathematical expectation $\langle R \rangle_U$ of an observable R in the state U is described by the relation

$\langle R \rangle_U = \mathrm{Sp}\{UR\}$   (3)

(A4). The state U of the physical system ensemble evolves in such a way that after time t it turns into the state $U_t$,

$U \to U_t = e^{-iHt/\hbar}\,U\,e^{iHt/\hbar}$   (4)

where H is a self-adjoint operator on $\mathcal{H}$.
The operator H, called the Hamiltonian, is a function H(q, p) of q and p. For physical systems which have a classical description, the form of this function coincides with that of the Hamiltonian function of the coordinates and momenta.

(A5). Measuring a quantity R on the physical system ensemble which is in the state U turns the state of the ensemble after the measurement into a state U', which is defined by

$U \to U' = \sum_n (U\chi_n, \chi_n)\,P_{[\chi_n]}$   (5)

where $(\psi, \chi)$ denotes the scalar product of the vectors $\psi$ and $\chi$ on $\mathcal{H}$; $\chi_1, \chi_2, \ldots, \chi_n, \ldots$ is a complete set of orthonormal eigenvectors of the operator R; $P_{[\chi_n]}$ is the projection operator on the vector $\chi_n$. The projection operator on a unit vector $\varphi$ is defined by the relation

$P_{[\varphi]}f = (\varphi, f)\,\varphi$   (6)

where f is an arbitrary vector on $\mathcal{H}$.

The superselection rules (Wick and co-workers, 1952) must be added to these propositions. But I shall not do this, because I shall not use these rules, and the above propositions do not pretend to be an axiomatization of quantum theory.

In certain cases the state of the physical system ensemble can be determined by specifying a unit vector $\psi$ on $\mathcal{H}$. Such states are called pure ones. Their statistical operator can be represented in the form

$U = P_{[\psi]}$   (7)

where $P_{[\psi]}$ is the projection operator on the unit vector $\psi$. In this case, due to (6), the statement (3) takes the form

$\langle R \rangle_\psi = \mathrm{Sp}\{RP_{[\psi]}\} = \sum_n (\chi_n, RP_{[\psi]}\chi_n) = \sum_n (\psi, \chi_n)(\chi_n, R\psi) = (\psi, R\psi)$   (8)

The relation (4) can be written as

$\psi \to \psi_t = e^{-iHt/\hbar}\,\psi$   (9)

The set of vectors $\chi_1, \chi_2, \ldots, \chi_n, \ldots$ represents an orthonormal basis in the Hilbert space $\mathcal{H}$. The vector $\psi$, describing a pure state of the physical system ensemble, is called the state vector or wave function.

All statistical statements of quantum mechanics can be derived from the propositions (A1)-(A5). The detailed analysis of these statements can be found in the monograph by von Neumann (1932). The formal measurement theory can be derived from the above statements. This was shown by von Neumann (1932).
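Propositions (A3) and (A5) can be tried out on the smallest possible example. The sketch below assumes a two-dimensional Hilbert space, a pure state with spin up along z, the spin component along x as the measured quantity R, and a scalar product that is conjugate-linear in its first argument; the helper names (`inner`, `projector`, `expectation`, `measure`) are illustrative and not part of the text.

```python
import math


def inner(a, b):
    """Scalar product (a, b), assumed conjugate-linear in a."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def apply(m, v):
    """Matrix m acting on vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def projector(v):
    """Matrix of the projection operator P_[v] for a unit vector v."""
    return [[v[i] * v[j].conjugate() for j in range(len(v))]
            for i in range(len(v))]


def expectation(U, R):
    """(A3): <R>_U = Sp{U R}."""
    return trace(matmul(U, R)).real


def measure(U, eigenvectors):
    """(A5): U -> U' = sum_n (U chi_n, chi_n) P_[chi_n]."""
    n = len(U)
    out = [[0j] * n for _ in range(n)]
    for chi in eigenvectors:
        weight = inner(apply(U, chi), chi).real
        P = projector(chi)
        for i in range(n):
            for j in range(n):
                out[i][j] += weight * P[i][j]
    return out


s = 1 / math.sqrt(2)
psi = [1 + 0j, 0j]                            # pure state: spin up along z
R = [[0j, 0.5 + 0j], [0.5 + 0j, 0j]]          # spin component along x
chis = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]  # orthonormal eigenvectors of R

U = projector(psi)                            # relation (7)
U_after = measure(U, chis)                    # relation (5)
mean_R = expectation(U, R)                    # relation (3)
purity_before = trace(matmul(U, U)).real
purity_after = trace(matmul(U_after, U_after)).real

print(mean_R, purity_before, purity_after)
```

In this example the measurement turns the pure state into an equal-weight mixture of the two eigenstates of R, which shows up as a drop of Sp{U^2} from 1 to 1/2.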
The relation (5), describing the disturbance of the ensemble state under the action of measuring a quantity R, represents a special process, which differs from the evolution of the ensemble described by relation (4).

The quantum mechanical process of measurement has two aspects. The informative side of the measurement of a quantity R represents a registration of some definite statistical distribution of the values of R in the given ensemble state. The perturbing side of the measurement describes the disturbance of the ensemble state after measurement. The latter property can be used for the preparation of the system ensemble in a definite state. The two sides of measurement are independent to such an extent that Margenau (1963) insisted on distinguishing between the measurement process (registration) and that of state preparation. Indeed, the two sides of measurement are independent to such an extent that they can be realized in principle by means of two different devices. One of them only prepares the state but does not record it, and the other only records but does not disturb it.

Let us consider such an idealized measurement process. Let an ensemble $E_\psi$ be described by the wave function $\psi$ and consist of N similar single systems ($N \to \infty$). Let the preparation part M of the device measuring the quantity R act on the systems of the ensemble $E_\psi$. One can imagine the preparing part M as a black box with one inlet and many outlet slits $S_1, S_2, \ldots$. Let all outlet slits be closed except for the slit $S_1$. The $N_1$ systems ($1 \ll N_1 \ll N$) of the ensemble $E_\psi$ find themselves in turn in the box M. A proportion of them is absorbed by the box and a proportion passes out through the open slit $S_1$.* Let $n_1$ systems pass through the slit $S_1$ ($1 \ll n_1 \ll N_1$). The box M transforms a part of the ensemble $E_\psi$ into another ensemble $E_1$, consisting of $n_1$ systems. In other words the box M prepares the ensemble $E_1$ in a certain state, which is not known exactly.
*For concreteness one can imagine any physical system as a moving electron. The measuring device measures the electron momentum. The black box M represents a region with a magnetic field, which is orthogonal to the direction of the electron motion. Depending on the magnitude of the electron momentum the electron is deflected through a certain angle and hits the screen with the slits $S_1, S_2, \ldots$. If only the slit $S_1$ is open, then such a device prepares an ensemble of electrons with momentum $p_1$.

In order to visualise the recording part of the measurement let us direct $l_1$ systems of the ensemble $E_1$ ($1 \ll l_1$, $k_1 = n_1 - l_1 \gg 1$) into a recording macroscopic device $\mathcal{P}$ with a 'pointer'. Let us assume that the influence of the system upon the device $\mathcal{P}$ deflects the pointer. The magnitude of this deflection shows the measured value of the quantity R. Let us assume that the box M is arranged in such a way that an analysis of $l_1$ systems of the ensemble $E_1$ by means of the measuring instrument $\mathcal{P}$ gives the result $R_1$ for all these systems. We shall not be interested in what happens to the systems in the instrument $\mathcal{P}$ and shall not consider them. Because the measurement has given the value $R_1$ of the quantity R for all $l_1$ systems ($l_1 \gg 1$), we have the right to conclude that the quantity R has the value $R_1$ for the rest of the ensemble $E_1$ ($k_1$ systems, $k_1 = n_1 - l_1 \gg 1$).† This conclusion can be drawn without analysing these $k_1$ systems by measurement with the instrument $\mathcal{P}$.

Thus by subjecting $N_1$ systems of the ensemble $E_\psi$ to dynamical interaction in the box M and analysing part of them in the instrument $\mathcal{P}$, we get the ensemble $E_1$, consisting of $k_1$ systems. The quantity R has the value $R_1$ in all systems of the ensemble $E_1$. If the ensemble $E_\psi$ is a pure one then the ensemble $E_1$ must be a pure one and be described by a certain wave function $\psi_1$, because the ensemble systems were subjected only to dynamical interaction with the box M.
But the measurement process on the ensemble $E_\psi$ is not finished, because only the one possibility, that some systems have the value $R = R_1$, has been investigated. To investigate the possibility that among the systems of $E_\psi$ there are some which have the value $R_2$ of the quantity R, it is necessary to open only the slit $S_2$ and to let $N_2$ systems ($N_2 \gg 1$, $N_2 \ll N$) of the ensemble $E_\psi$ pass through M. The $n_2$ systems ($n_2 \gg 1$) of $E_\psi$ which pass through the slit $S_2$ form an ensemble $E_2$. Analysing $l_2$ systems ($l_2 \gg 1$, $k_2 = n_2 - l_2 \gg 1$) of $E_2$ and finding the value $R_2$ of the quantity R for all of them, one concludes that the rest of the systems of $E_2$ have the value $R = R_2$ and are described by a wave function $\psi_2$. An analogous procedure has to be carried out for all slits $S_1, S_2, \ldots$.

Let us note that the distribution of the systems of the ensemble $E_\psi$ according to the values of R is produced independently of reading the pointer of $\mathcal{P}$, i.e. in the absence of any information about the state of the ensemble $E_\psi$. Of course, the box M is supposed to be arranged in such a way that any slit $S_m$ corresponds to a definite value $R = R_m$.

Let us suppose that all $N_m$ (m = 1, 2, ...) are equal and that the box M is arranged in such a way that, with all slits $S_1, S_2, \ldots$ open, a system of the ensemble $E_\psi$ finding itself in M is bound to go out through one of the slits. Supposing that $l_m = \alpha n_m$, where $0 < \alpha < 1$ and $\alpha$ is the same for all m = 1, 2, ..., one concludes that the number $k_m$ (m = 1, 2, ...) of systems in the ensemble $E_m$ is proportional to the probability of measuring the value of R as equal to $R_m$ among the systems of $E_\psi$.

Let us unite all the ensembles $E_1, E_2, \ldots$ into one ensemble E'. The ensemble E' is a mixture of the pure ensembles $E_1, E_2, \ldots$ and cannot be described by means of a wave function. The reason is that E' is obtained as a result of a set of different dynamical actions and not of one dynamical action.
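That the united ensemble E' admits no wave function can also be seen formally from relation (5). The short derivation below is a sketch under two assumptions that are not spelled out in the text at this point: the eigenvalues $R_m$ are non-degenerate, and the expansion coefficients $c_n$ of $\psi$ in the eigenvectors $\chi_n$ of R are introduced here only for illustration.

```latex
% Expansion of the initial wave function in the eigenvectors of R
\psi = \sum_n c_n \chi_n , \qquad \sum_n |c_n|^2 = 1 .

% Pure ensemble E_\psi: the statistical operator is a projector
U = P_{[\psi]}, \qquad U^2 = U, \qquad \mathrm{Sp}\, U^2 = 1 .

% United ensemble E': relation (5) with weights (U\chi_n,\chi_n) = |c_n|^2
U' = \sum_n |c_n|^2 \, P_{[\chi_n]}, \qquad
\mathrm{Sp}\, U'^2 = \sum_n |c_n|^4 \le \Big( \sum_n |c_n|^2 \Big)^2 = 1 .

% Equality holds only if a single |c_n| equals 1, i.e. if \psi is itself
% an eigenvector of R. Otherwise \mathrm{Sp}\, U'^2 < 1, so U' is not a
% projector P_{[\psi']} and E' cannot be described by any wave function.
```

The weights $|c_m|^2$ are the probabilities to which the numbers $k_m$ of systems in the ensembles $E_m$ are proportional.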
Indeed, some systems are subjected to the action of the box M with only the slit $S_1$ open, others with only the slit $S_2$ open, and so on. These are different dynamical actions, which lead to different results. I shall call this action the statistical action, keeping in mind that the statistical character of the action manifests itself in a different dynamical action of the measuring device upon systems with different values of R.

Of course the objection can be made to the above that it is not necessary to open only one slit in the box M. One can open all slits or even not use the box M. One can merely take a small part of the ensemble $E_\psi$ and investigate the distribution of the values of R by means of the measuring instrument $\mathcal{P}$. The same statistical distribution of the values of R will be found in the rest of $E_\psi$. Thus, one can find the distribution of values in $E_\psi$ without disturbing it.

†The concept of ensemble was introduced into quantum mechanics in order to know the state without perturbing it (von Neumann, 1932, Chap. 4, Section 1).

This is a valid objection. But it takes into account only the informational side of measurement, neglecting the state-preparing side. Essentially it is equivalent to the perfectly correct statement that our knowledge about the ensemble state does not have an influence upon the state of the ensemble. However the measurement is not reduced merely to a change of information about the state of the ensemble. The measurement influences the ensemble state. All physicists agree on this. There is disagreement only upon the question of how it influences the state of the ensemble. The influence of measurement upon the state of the system being measured is the main difference between the quantum theory of measurement and the classical one.

Let an electron ensemble state be described by a wave function $\psi$.
Then $|\psi(q)|^2\,dV$ represents the probability of finding the electron in the volume dV. To measure the electron position means that it is necessary not only to measure the distribution $|\psi(q)|^2$ for the ensemble, but also to determine the action of this measurement upon the ensemble description. To measure the electron position and to find it in the volume dV means selecting all the electrons of the ensemble which have been found in dV, constituting a new ensemble of them and solving the problem of the description of this new ensemble. Just such a problem arises in quantum measurement theory.

To elucidate the nature of this problem I have divided the united measurement process into two parts: an informational part and a state-preparation part. Such a division is an idealization, which is possible only if the state preparation and the recording process happen instantaneously, and if the change of the ensemble state due to the process (4) can be neglected.

Usually the measurement device cannot be divided into an informational part and a state-preparation part. Besides, nobody measures in the way that has been described, i.e. first selecting the systems having the value $R = R_1$ and discarding the rest, then selecting the systems having the value $R = R_2$ and discarding the rest, and so on. Such a selection, carried out blindly, is inefficient. In practice the measurement is performed in the following way. One finds the value of the quantity R for a single system. Depending on the value obtained for R, the system is attributed to one of the ensembles $E_1, E_2, \ldots$. For this reason some physicists believe that the appearance of a mixed state of the ensemble is connected with a change of the information available to an observer. Other authors connect its appearance with the fact that the measuring device is a macroscopic one.
Some physicists reject the reduction of the pure ensemble state to the mixed one, stating (quite correctly) that a dynamical action cannot reduce a pure state to a mixture (Wigner, 1963). If, in addition, one considers that the wave function describes the state of a single system and understands this word for word, then the measurement action upon the system to be measured assumes in general a mystical character.

Thus, the quantum measurement process is a set of single measurements. This term will be used later on just in this sense. To measure the quantity R and to obtain a value R' means performing a set of single measurements of R, selecting those systems for which the measurement has yielded the result R' and constituting a new ensemble of the selected systems. The measurement is in the first place a statistical action upon the ensemble systems, which can be accompanied by a dynamical one. The statistical action of measurement is conditioned by the statistical character of its description in quantum mechanics.* The system selection is an attribute of a measurement. The means and manner by which this selection is produced are of no importance. In any case this selection is not a result of a change of the observer's information about the ensemble state, because, as we have seen, this selection can be produced blindly, without any information about the ensemble state.

The relation (5) describes the result of the measurement action upon the ensemble in the state U. The measurement is supposed to be performed instantaneously, so that the state evolution described by (4) can be neglected. One can have doubts that the measurement action is described by the projection operators $P_{[\chi_n]}$ upon the eigenvectors $\{\chi_n\}$ of the operator of the measured quantity, and propose another way. I believe that this is not a very essential detail. I have chosen the measurement action in the form (5) for the reason that its properties have been investigated in detail by von Neumann (1932).
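Selection as a purely statistical action has a simple classical counterpart, which can be sketched numerically. The example below is an illustration under explicit assumptions: a diffusing ensemble is modelled by a probability distribution on a ring of 21 sites, its evolution by a lazy random-walk step standing in for a diffusion equation, and the measurement by keeping only the members found in a region V and renormalising. The lattice, the step rule and all names are hypothetical choices made for this sketch.

```python
def step(w):
    """One diffusion step: half of the weight stays, a quarter moves each way."""
    n = len(w)
    return [0.5 * w[i] + 0.25 * w[i - 1] + 0.25 * w[(i + 1) % n]
            for i in range(n)]


def evolve(w, steps):
    for _ in range(steps):
        w = step(w)
    return w


def select(w, region):
    """Statistical action: keep only members found in region, renormalise."""
    kept = [w[i] if i in region else 0.0 for i in range(len(w))]
    total = sum(kept)
    return [x / total for x in kept]


def mean_position(w):
    return sum(i * x for i, x in enumerate(w))


n_sites = 21
w_initial = [0.0] * n_sites
w_initial[10] = 1.0                        # all particles start at site 10

w_at_t0 = evolve(w_initial, 10)            # the ensemble at the time t0
V = set(range(12, 16))                     # region in which particles are sought
w_selected = select(w_at_t0, V)            # new ensemble formed by selection

later_unselected = evolve(w_at_t0, 5)      # evolution without any selection
later_selected = evolve(w_selected, 5)     # evolution of the selected ensemble

print(mean_position(later_unselected), mean_position(later_selected))
```

No force acts on any particle in this sketch; the two ensembles evolve differently after the selection only because they consist of different members.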
*The action of measurement upon the ensemble state also takes place in the theory of Brownian motion, where the dynamical action of measurement can certainly be neglected. For instance, let an ensemble $E_W$ of Brownian particles be described by a function W(q, t) satisfying the Einstein-Fokker equation. To measure the position of a Brownian particle at the time $t_0$ and to find it in a volume V means selecting from the ensemble $E_W$ only those particles which have been found in V at the time $t = t_0$, and constituting a new ensemble $E_{W_0}$, described by the function $W_0(q, t)$, which does not vanish only within V at $t = t_0$. Later, at $t > t_0$, the ensemble $E_{W_0}$ will evolve in a different way from the ensemble $E_W$. In other words, the measurement at the time $t_0$ changes the probability of detecting the Brownian particle at the point q at the time $t > t_0$, although no dynamical action has been exerted upon the particle and only selection (i.e. statistical action) has been effective.

Unfortunately, the measurement problem is not exhausted by this consideration. Quantum mechanics always attributes a result of measurement to a state U of the ensemble to be measured. For such an attribution to be possible, the measurement would have to be performed sufficiently quickly, in principle instantaneously. This means the following. Let the measurement of the quantity R continue during the time T and a set $\{R_T\}$ of results be obtained. This set $\{R_T\}$ depends in general on the duration T of a single measurement. If a limit of the distribution $\{R_T\}$ with $T \to 0$ exists, then by definition the measurement of the quantity R can be performed instantaneously. For an instantaneous measurement the result can be attributed to the state U in which the ensemble has been found directly before the measurement, even if the ensemble state has changed during the measurement process.

However, it is possible that some quantity R cannot be measured instantaneously, i.e. no limit of the distribution $\{R_T\}$ of measurement results exists for $T \to 0$. For instance the energy and momentum of a particle are such quantities. In accord with the uncertainty principle, the smaller the measurement time T is, the greater is the inaccuracy of the measurement of energy and momentum (Bohr, 1928; Heisenberg, 1930; Landau and Peierls, 1931; Mandelstam and Tamm, 1945; Fock and Krylov, 1947; Aharonov and Bohm, 1961; Fock, 1962). For a non-vanishing time T the measurement result cannot be attributed to the ensemble state directly before the measurement. Indeed, if this were possible, then it would be unclear why the measurement results have no limit for the measurement time $T \to 0$. The measurement results can be attributed to the ensemble state U during the measurement process only if U is unchanged (or is changed very slightly) during the measurement time. Thus, if the measurement requires a non-vanishing time, then its result can be attributed to the ensemble state only for those states for which the change according to the relation (4) is negligible during the measurement time.

Using the relations (4) and (5), let us give a formal consideration of the measurement process lasting for the time T during which the measurement instrument is switched on. Let the measurement of the quantity R be performed within the period [0, T] and the ensemble state be described at the time instant t = 0 by a statistical operator $U_0$. Let the operator $U_0$ evolve within the period of time [0, t] according to (4), turning into $U_t$ at the instant t. Let the measurement process (5) be performed at the moment t, $U_t$ turning into $U'_t$. Let $U'_t$ evolve within the period [t, T] according to (4), turning into $U'_T$ at the instant T.
A simple calculation shows that during the time $T$ the statistical operator $U_0$ turns into $U'_T$:

$$U_0 \to U'_T = \sum_n \left(U_0\, e^{iHt/\hbar}\chi_n,\; e^{iHt/\hbar}\chi_n\right) e^{-iH(T-t)/\hbar}\, P_{\chi_n}\, e^{iH(T-t)/\hbar} \tag{10}$$

where $\chi_1, \chi_2, \ldots$ is a complete orthonormal set of eigenvectors of the operator $R$, and $P_{\chi_n}$ denotes the projector onto $\chi_n$. The instant $t$ at which the measurement process is performed is supposed to be indefinite but within the period $[0, T]$. For the described process to be a real measurement of the quantity $R$, it is necessary that the state $U'_T$ depend only on the initial state $U_0$ and the operator $R$. In particular, $U'_T$ must not depend on the instant $t$ at which the measurement (5) has been performed. It follows from (10) that the last condition is fulfilled if the vectors $\{\chi_n\}$ are eigenfunctions of the Hamiltonian $H$ and, hence,

$$[R, H]_- = RH - HR = 0 \tag{11}$$

This result can be found in von Neumann's book (1932, Chap. 5, Section 1). It means that the action of the measurement device upon the measured system during the measurement process must be such that the Hamiltonian $H$ comes to commute with the operator of the measured quantity.

Suppose the relation (11) is fulfilled. The question arises as to which state the measured values should be attributed. The instant of measurement is indefinite, and the ensemble state $U$ changes within the period $[0, t]$ according to (4). For the measured values of $R$ to be attributed to a definite state $U$, that state must be stationary within the period $[0, t]$.* This leads to the condition

$$[U_0, H]_- = 0 \tag{12}$$

which is a severe restriction upon $U_0$. In particular, the measurement of the distribution over the momenta of a free particle is possible (in the one-dimensional case) only if the momentum value is quite definite.

The above formal consideration is not consistent because on the one hand it uses the instantaneous measurement process, while on the other hand the measurement is supposed to continue for a non-vanishing period of time.
Nevertheless it indicates that a long duration of measurement of some quantities should give rise to obstacles to their measurability.

The necessity of a long measurement of physical quantities of the energy-momentum type is conditioned, in the end, by the time-energy uncertainty relation. Unlike the position-momentum uncertainty relation (Heisenberg, 1927; Robertson, 1929), it cannot be derived from the statements (1)-(5) of quantum mechanics. Indeed, the formalism of quantum mechanics contains the time $t$ as a parameter which commutes with the Hamiltonian $H$ and is not conjugate to $H$ in the sense of (2), as $q$ and $p$ are. Thus the time-energy uncertainty relation is an additional statement, which should be taken into account in the formalism of quantum mechanics. It does not permit the instantaneous measurement of quantities of the energy-momentum type, and it leads to the restrictions (11) and (12). All this is in contradiction to the basic statements (1)-(5) of quantum mechanics, which suppose the instantaneous measurability of physical quantities in any state, and it is apparently connected with the non-relativistic character of quantum mechanics.

The subsequent analysis shows that in reality relation (12) cannot be fulfilled for quantities of the energy-momentum type. In other words, the measured values of energy and momentum can never be attributed to any definite ensemble state. In this sense energy and momentum are not measurable, and this is a corollary of the time-energy uncertainty relation.

Later on I shall use the coordinate representation of state vectors. In this representation every vector $\psi$ in the Hilbert space $\mathcal{H}$ is represented by a square-integrable function $\psi$ of the coordinate $q$. The scalar product $(\varphi, \psi)$ of two vectors $\varphi$ and $\psi$ is represented by

$$(\varphi, \psi) = \int \varphi^*(q)\,\psi(q)\,dq \tag{13}$$

where $*$ denotes the complex conjugate. The integration is performed over all coordinates $q$.
The position operator $q$ and the momentum operator $p$ are defined respectively as the operator of multiplication by $q$ and as the differentiation operator

$$p = -i\hbar\frac{\partial}{\partial q} \tag{14}$$

*Within the period $[t, T]$ the state $U'_t$ is stationary due to (11).

2. The Possibility of an Experimental Test of the Statistical Statements of Quantum Mechanics

In the present section I shall investigate the possibility of an experimental test of the statistical statements represented by (A2) and (A3). Because this problem is very complicated, I shall confine myself to investigating the measurability of the simplest physical quantities, such as coordinate and momentum.

The complication consists in the impossibility of carrying out a purely formal analysis. For instance, in measuring a momentum it is stated, on the one hand, that the momentum operator is represented by (14) and that this operator has to be used in (8) for the calculation of the corresponding mean values; on the other hand, it is necessary to describe some measurement process which is, by definition, the momentum measurement. If the measurement results disagree with the statements (8) of quantum theory, then a proponent of the theory can always say that this measurement process is not a momentum measurement, and that another measurement process should be used for momentum measurement.

To reduce the number of possible measurement processes I shall require that any measurement satisfy the correspondence principle. This means that a measurement process and its result must not depend on the model of the phenomena used in the measurement (classical, quantum or some other kind). In particular, a measurement process applied to a system which permits a classical description must give results which agree with classical mechanics.

The term 'correspondence principle' was introduced by Bohr (1918) to establish a connection between the old (before 1925) quantum mechanics and the classical theory of radiation.
Originally it meant that the frequency of the radiation emitted in a transition from one quantum orbit to another approaches asymptotically one of the frequencies obtained from a Fourier series of the functions describing the motion of the electron. In the contemporary version of quantum mechanics the correspondence principle describes a correspondence between the formalism of quantum mechanics and that of classical mechanics. For instance, the operator $p = -i\hbar\,\partial/\partial q$, which is unlike the classical momentum, is interpreted as momentum and in many cases (for instance in approximate estimations) is replaced by the classical momentum. This is a corollary of the correspondence principle.

The following consideration is the basis for such a correspondence. Let a particle ensemble be described by the wave function

$$\psi = \psi(q) = \sqrt{\rho(q)}\,\exp\left\{\frac{iS(q)}{\hbar}\right\} \tag{15}$$

where $\rho$ and $S$ are real functions of $q$. For simplicity only the one-dimensional case is considered. If $\rho$ and $p = dS/dq$ change only slightly over the distance of a wavelength $\lambda = \hbar/p$, i.e.

$$\left|\frac{1}{\rho}\frac{d\rho}{dq}\right| \ll \frac{1}{\hbar}\left|\frac{dS}{dq}\right|, \qquad \left|\frac{1}{p}\frac{dp}{dq}\right| \ll \frac{1}{\hbar}\left|\frac{dS}{dq}\right| \tag{16}$$

then such an ensemble can be described classically with $S(q)$ as an action. The state density of such an ensemble in the phase space $(q, p)$ is described by the distribution function

$$W(q, p) = \rho(q)\,\delta\!\left(p - \frac{dS}{dq}\right) \tag{17}$$

Indeed, one obtains from (8) and (15)-(17)

$$\langle F(q, p)\rangle_\psi = \int \psi^*(q)\,F\!\left(q, -i\hbar\frac{\partial}{\partial q}\right)\psi(q)\,dq = \int \rho(q)\,F\!\left(q, \frac{dS}{dq}\right)dq = \int F(q, p)\,W(q, p)\,dq\,dp \tag{18}$$

For this reason one concludes that the operator (14) corresponds to momentum. The momentum is interpreted in the sense of classical mechanics. The function $W(q, p)$ describes an ensemble of classical systems. The state of every system is described as a 'point' in the phase space. Every 'point' has a volume which is greater than the characteristic volume $\hbar$ of the phase space. The essential dependence of the distribution (17) on the coordinate $q$ alone is conditioned by the fact that the ensemble state is pure, i.e.
the ensemble state is described by a wave function, not by a mixture of them.

To avoid describing a measurement process for every single quantity, it is natural to use the rich experience of classical mechanics. The correspondence principle is used for this purpose. For instance, it follows from the correspondence principle that the quantity $p$ described by the operator (14) is to be measured in the same way as momentum is measured in classical mechanics. Measurement connects the symbols of the quantum mechanical formalism with phenomena of the real world and gives these symbols content and sense. By referring to classical mechanics, the correspondence principle formalizes the relation between the formalism of quantum mechanics and measurement. The measurement procedures for different quantities are supposed to have been worked out in classical mechanics. Because the correspondence principle is the least formalized part of the theory, I shall hold it responsible for any possible disagreement between experiment and the formalism of quantum theory. This means that an experimental test of the statistical statements of quantum theory is considered as a test of the correspondence principle.

Although relation (8) gives only expressions for mean values, if it is valid for all self-adjoint operators then, as von Neumann (1932, Chap. 4) has shown, it contains all the statistical statements of quantum mechanics and permits the calculation of the probability of measuring a given value of any quantity $R$. In particular, it follows from (8) that in every single measurement one can obtain only a value $R'$ which is an eigenvalue of the operator $R$.

Suppose, for instance, that a self-adjoint operator $R$ has a discrete spectrum of eigenvalues $R_1, R_2, \ldots$ For simplicity, let every eigenvalue $R_i$ ($i = 1, 2, \ldots$) be related to only one eigenvector $\chi_i$. The vectors corresponding to different eigenvalues are orthogonal.
Let us normalize them in such a way that

$$(\chi_i, \chi_k) = \int \chi_i^*(q)\,\chi_k(q)\,dq = \delta_{ik}, \qquad i, k = 1, 2, \ldots \tag{19}$$

An arbitrary wave function $\psi$ can be represented in the form

$$\psi(q) = \sum_i a_i\,\chi_i(q) \tag{20}$$

It follows from (19), (20) and the normalization condition of the wave function $\psi$ that

$$\sum_i a_i^* a_i = \sum_i |a_i|^2 = 1 \tag{21}$$

Let us calculate the mean value $\langle F(R)\rangle_\psi$ of a function $F$ of the quantity $R$. Because the $\{\chi_i\}$ are eigenvectors of the operator $R$,

$$R\chi_i = R_i\chi_i, \qquad i = 1, 2, \ldots$$

one obtains from (8) and (20):

$$\langle F(R)\rangle_\psi = \sum_i F(R_i)\,|a_i|^2 \tag{22}$$

Since (22) is valid for every function $F$, it follows from (21) and (22) that the quantity $R$ can take only the values $R_i$ ($i = 1, 2, \ldots$). The quantity $|a_i|^2$ is the probability that $R$ has the value $R_i$ in the state (20). Thus any single measurement of the quantity $R$ has to yield one of the eigenvalues $R_i$ of the operator $R$. It is appropriate to point out that the last statement follows from (A3) only in the case when the condition (A3) is fulfilled for all operators $R$.

Let us consider the problem of particle-position measurement. For simplicity, the particle is considered to be charged (for example, an electron). Let there be some macroscopic device (a generator) which prepares an electron in some state. For instance, an electron gun can serve as a generator. Let us imagine many similar generators (an ensemble) which are in the same macroscopic state. Every generator prepares an electron upon which a single measurement is performed. Single measurements performed upon different electrons yield, in general, different results. The totality of single measurement results permits the determination of the distribution of the quantity in the state $\psi$ in which we are interested.

Let a detector, capable of recording the time at which an electron passes through it, be placed some distance from the generator. The detector can be a Geiger counter or other similar device. Let $\tau$ be the operating time of the detector. Let a set of experiments be performed.
During each experiment the generator is switched on and, if the detector trips, it records the time of its tripping. If the detector trips at a time $t$ after the generator is switched on, then this means, by definition, that during the period $(t - \tau, t)$ the electron was found in the volume of the detector. Thus the electron coordinates in this period coincide with the detector coordinates to within the detector size.

Let a set of $N$ experiments be performed at a fixed position of the detector. Let $N_0$ be the number of times the detector has not tripped, $N_1$ the number of times it has tripped during the period $\tau$ [i.e. in the period $(0, \tau)$], $N_2$ the number of times it has tripped in the period $2\tau$, and so on. We have

$$N = N_0 + N_1 + N_2 + \ldots$$

The limit

$$\lim_{N\to\infty} \frac{N_s}{N}$$

represents the probability of detecting an electron at the time $s\tau$ within the space taken up by the detector. For this reason one has

$$|\psi(q, s\tau)|^2\,dV = \frac{N_s}{N}, \qquad s = 1, 2, \ldots$$

where $\psi$ is the electron wave function, $q$ are the detector coordinates and $dV$ is its volume. By performing measurements with different positions of the detector with respect to the generator, $|\psi(q, s\tau)|^2$ can be calculated for different positions $q$ and times $s\tau$ ($s = 1, 2, \ldots$) to within the detector size and operating time $\tau$. The detector size and its operating time are reduced as far as possible in order to increase the accuracy of the calculation of $|\psi(q, s\tau)|^2$. An optical or electron microscope can be used if necessary. I shall not describe the measurement of the particle position by means of the microscope, but refer to the paper by Mandelstam (1950). It should only be noted that by using a photon (electron) beam of sufficiently high energy, the position of an electron and the instant of its registration can be determined, in principle, with any desired accuracy. This means that the electron coordinates can be measured with arbitrary accuracy and instantaneously.
Of course, a statement of such a kind is an idealization of the real state of affairs. Nevertheless I shall adopt this thesis, remembering that within non-relativistic physics nothing hinders, in principle, the accurate and instantaneous measurement of the electron position, provided that the energy is not too high and pair production can be neglected (see, however, Landau and Peierls, 1931; Pauli, 1933, Section 2).

Let us consider the problem of the measurement of electron momentum. In classical mechanics it is supposed that the influence of measurement upon the electron motion can be made infinitesimal. So, to measure the electron momentum, it is sufficient to measure two neighbouring positions $q$ and $q + \Delta q$ of a single electron which are separated by a short period of time $\Delta t$, and to calculate the electron velocity $v = \Delta q/\Delta t$. The momentum $p$ is then defined by

$$p = mv = m\frac{\Delta q}{\Delta t} \tag{23}$$

Formula (23) determines the mean momentum over a period of time $\Delta t$. Measuring the momentum over ever shorter periods $\Delta t$, one obtains in the limit the exact value of the momentum. In this case the limit of $\Delta q/\Delta t$ as $\Delta t \to 0$ is supposed to exist.

In quantum mechanics such a method of measurement is also possible. As the position can be measured, in principle, instantaneously, it is possible to measure two positions of an electron at two instants separated by a short period of time and then to use formula (23). By definition, the measurement result is a momentum averaged over a time period $\Delta t$. Let us suppose that a set of such measurements is performed upon an electron ensemble described by a wave function $\psi$. A number of momentum values, characterized by a spread $\Delta p$, is obtained.
Generally, the spread (uncertainty) of momentum depends on the wave function before the first position measurement, but in any case, due to the uncertainty principle,

$$|\Delta p| \ge \frac{\hbar}{|\Delta q|} \tag{24}$$

where $|\Delta q|$ is the distance between the positions of the electron in the first and second measurements of its position. As $|\Delta q| < c\,\Delta t$, where $c$ is the speed of light, it follows from (24) that

$$|\Delta p| > \frac{\hbar}{c\,\Delta t}$$

Thus, unlike classical mechanics, a reduction in the time of measuring momentum leads to an increase of momentum uncertainty, independent of the form of the wave function which described the electron ensemble before measurement.

Although nothing prevents a set of measurements of momentum averaged over a short period of time $\Delta t$ from being made, one cannot assert that the resulting momentum values represent those of an electron ensemble described by some wave function. These momentum values cannot be attributed to the wave function which described the ensemble directly before measurement, because the measurement result depends on the measurement duration $\Delta t$ (as $\Delta t \to 0$). Nor can the momentum values be attributed to the wave function arising during measurement, because, however short the period is, the wave function changes essentially in this period, and the instant to which the measured values should be attributed is unknown. Thus, although a measurement can be made, the results cannot be connected with any wave function and, hence, with the statistical statements of quantum mechanics.

Let us consider the measurement of momentum averaged over a long period of time $T$. Let there be a generator localized in some region $\Omega$ with linear dimension of order $\Delta q$. The generator prepares an electron ensemble in a state described by a wave function $\psi$. Let the electron be detected at a distance $q$ from the generator at a time $T$ after the generator is switched on ($|q| \gg \Delta q$). Let us assume, by definition, that the average of the electron momentum over the period $T$ is measured.
It is defined by

$$p = \frac{mq}{T} \tag{25}$$

Essentially, relation (25) is relation (23) used for the case when $T$ is much longer than the characteristic time of evolution of the wave function. The inaccuracy $\Delta p$ of the momentum measurement is determined by the relation

$$\Delta p = \frac{m\,\Delta q}{T} \tag{26}$$

For a fixed generator size, the longer the measurement time, the smaller the inaccuracy. I shall call the momentum defined by relation (25) q-momentum (from the word 'quantum'), in contrast to the momentum defined by relation (23) with $\Delta t \to 0$. I call the latter c-momentum (from the word 'classical').

In the absence of an electromagnetic field the q-momentum distribution can be determined by performing a set of single measurements of q-momentum. This distribution is determined by the relation

$$W(p)\,dp = |\psi_p|^2\,dp \tag{27}$$

where $W(p)\,dp$ is the probability of measuring a momentum $p$ within the region $dp = dp_1\,dp_2\,dp_3$ and

$$\psi_p = (2\pi\hbar)^{-3/2}\int e^{-ipq/\hbar}\,\psi(q, t)\,dq \tag{28}$$

is the Fourier component of the wave function.

It should be noted that the measurement of q-momentum (25) reduces to a position measurement at the instant $t = T$. The wave function of the free electron evolves in such a way that, for $t$ long enough, the form of $|\psi(q, t)|^2$ determines $|\psi_p|^2$. I shall show this in the simple example of one-dimensional motion. Let the wave function at the initial instant have the form

$$\psi(q, 0) = \frac{1}{\sqrt{2\pi\hbar}}\int \psi_p\,e^{ipq/\hbar}\,dp \tag{29}$$

$$\psi_p = \sqrt{\frac{A}{\sqrt{\pi}\,\hbar}}\,\exp\left[-\frac{A^2(p - p_0)^2}{2\hbar^2}\right] \tag{30}$$

where $A$ is a constant representing an effective width of the wave packet and $p_0$ is the mean value of the q-momentum. According to (27) and (30), the probability of detecting a q-momentum $p$ within the range $dp$, which is

$$W(p)\,dp = |\psi_p|^2\,dp = \frac{A}{\sqrt{\pi}\,\hbar}\,\exp\left\{-\frac{A^2(p - p_0)^2}{\hbar^2}\right\}dp \tag{31}$$

has the form of a normal distribution with mean value $p_0$ and dispersion $\hbar^2/(2A^2)$. At the initial instant the centre of the packet was at the point $q = 0$. With the passage of time the wave function evolves according to Schrödinger's equation.
At the instant $t$ it turns into

$$\psi(q, t) = \frac{1}{\sqrt{2\pi\hbar}}\int \psi_p\,\exp\left[\frac{ipq}{\hbar} - \frac{ip^2 t}{2m\hbar}\right]dp \tag{32}$$

Substituting (30) into (32) and carrying out the calculation yields

$$\psi(q, t) = \sqrt{\frac{A}{\sqrt{\pi}\,(A^2 + i\hbar t/m)}}\,\exp\left\{-\frac{(q - p_0 t/m)^2}{2A^2(1 + i\hbar t/A^2 m)} - \frac{ip_0^2 t}{2m\hbar} + \frac{ip_0 q}{\hbar}\right\} \tag{33}$$

Hence, for the probability $dW$ of detecting the particle in the vicinity of the point $q$ within the range $dq$ at the instant $t$, one gets

$$dW = |\psi(q, t)|^2\,dq = \frac{1}{\sqrt{\pi\,(A^2 + \hbar^2 t^2/m^2 A^2)}}\,\exp\left\{-\frac{(q - p_0 t/m)^2}{A^2 + \hbar^2 t^2/m^2 A^2}\right\}dq \tag{34}$$

For $t \gg mA^2\hbar^{-1}$ the coordinate distribution reproduces the initial distribution (31) over q-momenta. Indeed, in accordance with (25), assuming

$$p_t = \frac{mq}{t} \tag{35}$$

and substituting $q$ from (35) into (34), one gets

$$W(p_t)\,dp_t = \frac{A}{\hbar\sqrt{\pi B}}\,\exp\left\{-\frac{A^2(p_t - p_0)^2}{\hbar^2 B}\right\}dp_t \tag{36}$$

where

$$B = 1 + \frac{m^2 A^4}{\hbar^2 t^2} \tag{37}$$

If $t \to \infty$, then $B \to 1$ and the q-momentum distribution obtained from the coordinate distribution at the instant $t$ coincides with the distribution (31).

It follows from this example that the q-momentum distribution subsequently turns into a coordinate distribution and can be measured. It should be noted that, in measuring $|\psi_p|^2$ in such a way, we have no right to assert that the momentum distribution is measured in any definite state. The fact is that the wave function has changed essentially during the measurement time. At first it was localized in the vicinity of the generator, and then it spread over space. By measurement one obtains only time-independent characteristics of the wave function, such as the amplitudes $|\psi_p|$.

Thus the above manner of q-momentum measurement does not permit the measurement of momentum or momentum distribution in a state described by a wave function. The best that can be measured is the momentum distribution averaged in some way over states with different wave functions. In the end this is connected with the fact that, due to the uncertainty principle, the precise measurement of momentum needs a long time, during which the wave function changes essentially.
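The Gaussian-packet example of (31)-(37) can be checked numerically. The following is only a minimal sketch under stated assumptions: natural units with $\hbar = m = A = 1$ and $p_0 = 2$ are illustrative values not taken from the text, and the function names are hypothetical. It integrates the position density (34), converts positions to q-momenta through (35), and compares the resulting variance with $\hbar^2 B/(2A^2)$ implied by (36) and (37).

```python
import math

# Illustrative values (assumptions, not from the text): natural units.
HBAR = 1.0   # Planck constant (reduced)
M = 1.0      # particle mass
A = 1.0      # effective packet width, as in (30)
P0 = 2.0     # mean q-momentum, as in (30)


def position_density(q, t):
    """|psi(q, t)|^2 from equation (34)."""
    width2 = A**2 + (HBAR * t / (M * A))**2
    return math.exp(-(q - P0 * t / M)**2 / width2) / math.sqrt(math.pi * width2)


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def b_factor(t):
    """B from equation (37)."""
    return 1 + (M * A**2 / (HBAR * t))**2


def q_momentum_stats(t):
    """Mean and variance of p_t = m q / t, equation (35), under density (34)."""
    centre = P0 * t / M
    half = 12 * math.sqrt(A**2 + (HBAR * t / (M * A))**2)
    lo, hi = centre - half, centre + half
    norm = simpson(lambda q: position_density(q, t), lo, hi)
    mean = simpson(lambda q: (M * q / t) * position_density(q, t), lo, hi) / norm
    var = simpson(lambda q: (M * q / t - mean)**2 * position_density(q, t), lo, hi) / norm
    return mean, var


if __name__ == "__main__":
    for t in (1.0, 10.0, 1000.0):
        mean, var = q_momentum_stats(t)
        print(t, mean, var, HBAR**2 * b_factor(t) / (2 * A**2))
```

In this sketch the variance of $p_t$ approaches the dispersion $\hbar^2/(2A^2)$ of (31) only for $t \gg mA^2/\hbar$, which is the regime in which the text identifies the coordinate distribution with the q-momentum distribution.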
Let us consider momentum measurement based on the law of conservation of momentum. To measure the electron momentum, a particle of mass $M$ is placed in the electron path. On collision the electron is captured by the particle and passes its momentum to it. By measuring the particle momentum one measures, by definition, the electron momentum at the instant of impact. For the electron momentum to be measured with an accuracy $\Delta p$, it is necessary that the uncertainty of the initial momentum of the particle be of the order of $\Delta p$. According to the uncertainty principle this is possible only if, before the collision, the particle is placed within a region with linear size of the order of $\Delta q$, where

$$\Delta q \approx \frac{\hbar}{\Delta p} \tag{38}$$

For simplicity I shall consider only one dimension. Let the wave function describing the electron ensemble have the form of the wave packet (32) with spread $L$. The uncertainty of the time at which the electron hits the particle is determined by the relation

$$\Delta t \approx \frac{m}{p}(L + \Delta q) \tag{39}$$

where $m$ is the electron mass and $p$ is its momentum. If one measures the momentum for long enough, it can be determined with great accuracy. Thus the uncertainty of the electron momentum measurement is conditioned only by the uncertainty of the initial momentum of the particle.

For the measured electron momentum to be attributed to any definite wave function, it is necessary that the wave function change only slightly during the measurement time $\Delta t$. In the optimum case the spread $\delta p$ of $\psi_p$ in the variable $p$ is determined by the uncertainty relation

$$\delta p \approx \frac{\hbar}{L} \tag{40}$$

During the time $\Delta t$ the phases

$$\varphi_p = \left(pq - \frac{p^2 t}{2m}\right)\Big/\hbar \tag{41}$$

corresponding to the different non-vanishing Fourier components $\psi_p$ change. The greatest phase difference arising during $\Delta t$ is

$$\Delta\varphi = \frac{p\,\delta p\,\Delta t}{m\hbar} \tag{42}$$

Substituting (39) and (40) into (42) one gets
$$\Delta\varphi \gtrsim 1 \tag{43}$$

This means that the wave function always changes essentially during the measurement time, and the measured values of momentum cannot be attributed to any definite wave function. They can be attributed only to an ensemble state averaged in some sense over the period $\Delta t$. In this sense the q-momentum distribution cannot be measured in this way. The reason preventing this can be formulated as the time-energy uncertainty relation. Indeed, the relations (42) and (43) can be written as

$$\Delta\varphi = \frac{\Delta E\,\Delta t}{\hbar} \gtrsim 1 \tag{44}$$

where $\Delta E$ is the uncertainty of the electron energy.

Let us consider momentum measurement based on the Doppler effect. Let an atom in an excited state radiate a photon with frequency $\omega_0$. If the atom moves, then the photon radiated in the direction of motion has the frequency

$$\omega = \omega_0\left(1 + \frac{V}{c}\right) \tag{45}$$

where $V$ is the atom velocity. By measuring the photon frequency one can determine the atom velocity from (45) and hence the atom momentum. The least error $\Delta\omega$ of the frequency determination is given by the relation

$$\Delta\omega \approx \frac{1}{T} \tag{46}$$

where $T$ is the measurement time, i.e. the period during which the objective of the spectrometer used for the photon frequency measurement is open. Let the atom ensemble be described by a wave function having the form of a wave packet with spread $L$. Again, for simplicity, the one-dimensional case is considered. Supposing that the atom speed $V \ll c$, one gets for the uncertainty $\Delta t$ of the photon radiation time

$$\Delta t = T + \frac{L}{c} \tag{47}$$

Calculating the atom velocity by means of (45), one obtains for the error of the atom momentum measurement the expression

$$\Delta p = M\,\Delta V = \frac{Mc\,\Delta\omega}{\omega_0} \approx \frac{Mc}{\omega_0 T} \tag{48}$$

Let us write the wave function of the atom in the form

$$\psi(q, t) = \frac{1}{\sqrt{2\pi\hbar}}\int \psi_p\,\exp\left[\frac{ipq}{\hbar} - \frac{iE_p t}{\hbar}\right]dp \tag{49}$$

where $E_p$ is the energy of the atom with momentum $p$. At the instant of photon radiation the atom energy is reduced by $\hbar\omega_0$. Photon radiation can occur at any instant within the period $\Delta t$.
This entails a phase uncertainty $\Delta\varphi_p$ of each of the Fourier components $\psi_p$, which is determined by

$$\Delta\varphi_p \approx \frac{\hbar\omega_0\,\Delta t}{\hbar} = \omega_0\,\Delta t \tag{50}$$

Thus the phase difference between different $\psi_p$ changes essentially in the period $\Delta t$. This means that the wave function changes essentially during the measurement time $\Delta t$. Thus the measured momentum cannot be attributed to any definite wave function, i.e. the momentum cannot be measured in the sense in which measurement is customarily treated in quantum mechanics. The best that can be done is to state that the measured momentum distribution is attributed to a set of wave functions having constant moduli $|\psi_p|$ of the Fourier components and indefinite phases. Statements of such a kind are absent in conventional quantum mechanics. Apparently they do not contradict quantum mechanics, but one cannot say that they confirm it.

The distribution $|\psi_p|^2$ over q-momenta can be obtained, because it does not involve any particular wave function describing a free particle, but this is not a measurement of the momentum distribution. The restriction on momentum measurement arises not from the impossibility of measuring momentum but from the impossibility of attributing the measured values to any definite state.

All the ways of measuring momentum considered here fit the case when the particle motion obeys the laws of classical mechanics. In other words, the method of measurement does not depend on the model which is used for the explanation of physical phenomena. In classical mechanics, too, it is necessary to attribute the measured value to a definite state, but in this case there are no wave functions and no uncertainty principle. It is the values of coordinates and momenta that determine a state. For this reason, in classical mechanics the problem of attributing measured values to a definite state does not exist.

Let us consider the measurement of the component $p_1$ of the momentum of a free particle at the point $q$. Let the free-particle ensemble be described by means of a wave function $\psi$.
Let us measure the particle position at the instants $t$ and $t + T$. In principle this is possible. Let us select only those single measurements which have given the particle position in a small vicinity of the position $q$ at the instant $t$. Let these single measurements constitute the ensemble $E_q$. Suppose further that in these cases the particle position measurement at the instant $t + T$ gives a result $q + \Delta q$, where $\Delta q$ is generally different for different single measurements. Let us suppose that each single measurement determines the component $p_1$ of the particle momentum at the point $q$. It is determined by the relation

$$p_1 = m\frac{\Delta q_1}{T} \tag{51}$$

Performing many single measurements, one gets a distribution of the momentum component $p_1$ at the point $q$. This distribution depends essentially on the choice of the period $T$ between the two consecutive positions of the particle. The shorter the period $T$, the more precisely the position of the particle has to be determined and the more energetic the beam of sounding particles (electrons or photons) that has to be used for the position determination. As a result the particle motion will be disturbed and the distribution over the component $p_1$ will be distorted. The shorter the period, the greater is the momentum dispersion of the particle. This means that the distribution over the component $p_1$ of the particle momentum cannot be measured. The criterion of such an impossibility is the dependence of the distribution on the period $T$ as $T \to 0$.

Let the particle position measurement be realized by means of a beam of sounding particles (for instance, electrons or photons). Let us take the beam to be directed normally to the first axis, along which the momentum component is to be measured. One can expect that in this case the sounding particle beam does not, on the average, influence the value of the momentum component $p_1$.
This means that the mean value $\langle p_1\rangle_q$ of the momentum component at the point $q$ does not depend on the sounding particle energy. The formal criterion of this is the existence of the limit

$$\langle p_1\rangle_q = \lim_{T\to 0}\frac{m\,\langle\Delta q_1\rangle}{T} \tag{52}$$

The problem of the existence of the limit (52) can be investigated by means of the formalism of quantum mechanics. I shall not do this, but shall confine myself to the optimistic supposition that such a limit exists. The existence of the limit (52) for all points $q$ and a certain ensemble of free particles described by the wave function $\psi$ means that it is possible to measure the mean value $\langle p_1\rangle_q$ of the momentum component at the point $q$ at the instant $t$ and to attribute it to the state described by the wave function $\psi$. Using proper sounding beams one can measure the mean values of the other momentum components at the point $q$. The possibility of measuring the mean value $\langle p\rangle_q$ of the momentum at the point $q$ in the state $\psi$ implies the possibility of measuring the mean angular momentum $\langle [q \times p]\rangle_q$ and other mean values which are linear in the momentum components of a free particle.

By measuring the electron position one is able to calculate the distribution $|\psi(q)|^2$ over coordinates at all instants of time. This permits the calculation of all moments $\langle q^l\rangle$ ($l = 1, 2, \ldots$) of the electron coordinates and their time dependence (for simplicity the one-dimensional case is considered). Using Schrödinger's equation for the free particle one can show that the moments $\langle p^l\rangle$ ($l = 1, 2, \ldots$) of momentum are expressed by the relations

$$\langle p^l\rangle = \frac{m^l}{l!}\frac{d^l}{dt^l}\langle q^l\rangle, \qquad l = 1, 2, \ldots \tag{53}$$

where $m$ is the mass of the particle. As long as the moments $\langle q^l\rangle$ can be measured at all instants of time, one can in principle calculate all the moments $\langle p^l\rangle$ ($l = 1, 2, \ldots$) and establish the momentum distribution at all instants of time. As long as the free-particle energy is expressed through the particle momentum, formula (53) also permits the calculation of the energy distribution at each instant of time. All these distributions can be attributed to a definite instant of time and consequently to a definite wave function.

However, the essential problem is whether or not the foregoing procedure is a measurement of the momentum distribution. Although to a degree this is a question of terminology, it is usually taken that a momentum distribution measurement is a procedure such that a certain value of momentum is obtained as the result of each single measurement. The set of all measured values of momentum constitutes a momentum distribution. The described procedure is not of such a kind. Here the result of a single measurement is a certain value of the coordinate. For this reason I shall not take this procedure to be a momentum measurement.

Let us consider the angular momentum measurement in the experiment of Gerlach and Stern (1924). A detailed analysis of this experiment can be found in any textbook on quantum mechanics (see, for instance, Pauli, 1933; Blokhintsev, 1963; Bohm, 1965). I shall confine myself to analysing to what extent this experiment proves that the projection of an angular momentum upon a certain axis takes values which are multiples of $\hbar$.

Let there be an atom whose electron shell has a non-zero angular momentum $\mathbf{M}$ resulting from the orbital motion of the electrons. The electron spins are supposed to be compensated. The magnetic moment $\boldsymbol{\mu}$ is connected with the angular momentum $\mathbf{M}$ by the relation

$$\boldsymbol{\mu} = \frac{e\mathbf{M}}{2mc} \tag{54}$$

where $e$ is the electron charge and $m$ is the electron mass. A beam of such atoms passes between the poles of a magnet with a very inhomogeneous field, the gradient of the field magnitude being directed along the magnetic field. In the magnetic field the atom obtains an additional energy and is affected by the force

$$\mathbf{F} = -\nabla(\boldsymbol{\mu}\mathbf{H}) \tag{55}$$

Passing through the magnetic field, the atoms move normally to the lines of force.
Under the action of the force (55), during their motion through the magnetic field the atoms acquire a momentum in the direction of the force $\mathbf{F}$, i.e. in the direction of the magnetic field. This momentum is different for different values of the projection $\mu_H$ of the magnetic moment upon the magnetic field direction. As a result the atom beam splits into several beams according to the magnetic moment projection $\mu_H$. After some time the beams are separated in space. Each beam can be recorded by the place where it strikes the screen. If this place is known, one can calculate the corresponding value of the magnetic moment projection $\mu_H$ and the value $M_H$ of the angular momentum projection upon the magnetic field direction. Experiments show that the value of the angular momentum projection measured in this way is $\hbar$-fold. Such is the conventional interpretation of the Stern-Gerlach experiment. The discreteness of the angular momentum projection $M_H$, and the fact that its values are multiples of $\hbar$, are explained by the corresponding properties of the eigenvalues of the angular momentum operator (Bohm, 1965, Chap. 14).

On the other hand, if the stationary states of the atom are discrete and each state obtains an additional energy

$$\Delta E_i = \Delta E_i(\mathbf{H}) \tag{56}$$

in the magnetic field, then, independently of the nature of this energy change, an atom in the $i$th stationary state placed in the inhomogeneous magnetic field is acted upon by the force

$$\mathbf{F}_i = -\nabla(\Delta E_i) \tag{57}$$

If the force is different for different discrete stationary states, then under its action the atom beam is split into several beams according to the stationary states. Thus, splitting into discrete beams is connected with the discreteness of the atomic states and the difference of their energies in the external magnetic field. Strictly speaking, it is the energy of the atom in the external magnetic field that is measured in the Stern-Gerlach experiment.
This was noted by Pauli (1933) and Blokhintsev (1968). The discreteness of the angular momentum projection $M_H$ results from the fact that the operator $\hat{M}_H$ commutes with the Hamiltonian and, hence, its eigenvalues can serve as labels of the stationary states.

One can see from an analysis of the motion of an atom beam in an inhomogeneous electric field that it is the second interpretation that is correct. If the atom is placed in an electric field $\mathbf{E}$, the energy of its stationary state changes a little. This is the so-called Stark effect. For simplicity let us consider hydrogen. It is known (see, for instance, Landau and Lifshits, 1963, section 77) that for a not too large electric field the change $\Delta E$ of the stationary state energy is linear in the electric field $\mathbf{E}$. In the first approximation of perturbation theory one gets for the $i$th undisturbed state

$$\Delta E_i = [\mathbf{D}]_i \cdot \mathbf{E} \tag{58}$$

where $[\mathbf{D}]_i$ denotes a quantity which depends on the eigenvectors of the undisturbed Hamiltonian and on the electric dipole operator $\mathbf{D}$, but not on the electric field $\mathbf{E}$. For some stationary states of hydrogen $[\mathbf{D}]_i$ is non-zero. A hydrogen atom beam moving in the inhomogeneous electric field $\mathbf{E}$ is acted upon by the force

$$\mathbf{F} = -\nabla([\mathbf{D}]_i \cdot \mathbf{E}) \tag{59}$$

which is different for atoms in different stationary states. As a result of the motion in a suitable electric field the beam is split into a few discrete beams. From a measurement of the beam deflection one can calculate the values of $[\mathbf{D}]_i \cdot \mathbf{E}$, which form a set of discrete values. At the same time the electric dipole operator $\mathbf{D}$ has the form

$$\mathbf{D} = e(\mathbf{q}_p - \mathbf{q}_e) \tag{60}$$

where $\mathbf{q}_p$ and $\mathbf{q}_e$ are the position operators of the proton and the electron respectively. The components of the operator $\mathbf{D}$ commute with each other and have a continuous spectrum of eigenvalues. Thus the interaction operator ($-\boldsymbol{\mu}\cdot\mathbf{H}$ or $-\mathbf{D}\cdot\mathbf{E}$) responsible for the beam splitting has a discrete spectrum in one case and a continuous one in the other. But splitting into discrete beams is produced in both cases.
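The argument that discrete beams arise in both the magnetic and the electric case can be illustrated numerically. The following is only a minimal sketch, not part of the original analysis: it assumes that the energy shift of each stationary state is linear in the field, so that the forces (57) and (59) reduce to $F_i = -c_i\,\partial(\text{field})/\partial z$. The function names, the field gradients and the sign convention $\Delta E_m = m\mu_B H$ are illustrative assumptions; the hydrogen $n = 2$ shifts $\pm 3ea_0E$ and $0$ are the standard first-order Stark values.

```python
# Sketch: discrete beam splitting from forces F_i = -grad(dE_i), cf. eqs (57) and (59).
# All names and numerical field gradients below are illustrative assumptions.

MU_B = 9.2740100783e-24      # Bohr magneton, J/T
E_CHARGE = 1.602176634e-19   # elementary charge, C
A_BOHR = 5.29177210903e-11   # Bohr radius, m


def forces_on_states(energy_coefficients, field_gradient):
    """Force on each stationary state whose energy shift is linear in the field.

    If dE_i = c_i * field, then F_i = -d(dE_i)/dz = -c_i * d(field)/dz.
    """
    return [-c * field_gradient for c in energy_coefficients]


def distinct_beams(forces, rel_tol=1e-9):
    """Group equal forces: states feeling the same force end up in the same beam."""
    beams = []
    scale = max((abs(f) for f in forces), default=0.0) or 1.0
    for f in sorted(forces):
        if not beams or abs(f - beams[-1]) > rel_tol * scale:
            beams.append(f)
    return beams


# Magnetic case: orbital angular momentum l = 1, dE_m = m * MU_B * H (assumed sign convention).
magnetic_coefficients = [m * MU_B for m in (-1, 0, 1)]
magnetic_beams = distinct_beams(
    forces_on_states(magnetic_coefficients, field_gradient=1.0e3))  # gradient in T/m

# Electric case: hydrogen n = 2, first-order Stark shifts -3*e*a0*E, 0, 0, +3*e*a0*E.
stark_coefficients = [c * E_CHARGE * A_BOHR for c in (-3, 0, 0, 3)]
stark_beams = distinct_beams(
    forces_on_states(stark_coefficients, field_gradient=1.0e9))  # gradient in V/m^2

print(len(magnetic_beams), "magnetic beams;", len(stark_beams), "electric beams")
```

In both cases the number of beams is fixed by the set of distinct energy coefficients of the stationary states, not by the spectrum of the interaction operator itself, which is the point made in the text.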
This means that the discreteness of the beams is produced not by the discreteness of the spectrum of the interaction operator, but by that of the whole Hamiltonian, i.e. by the discreteness of the stationary states of the atom. It follows that the Stern-Gerlach experiment cannot be used for testing the quantum mechanical statement which asserts that, in measuring the quantity $M_H$ whose operator $\hat{M}_H$ has a discrete spectrum, only values equal to the eigenvalues of the operator $\hat{M}_H$ can be obtained. Thus, in the Stern-Gerlach experiment a sorting of stationary states is produced. It can be considered as a measurement of angular momentum provided that the values $M_H$ are labels of the states.

Let us analyse to what extent the measured result can be attributed to a definite wave function. Suppose that in the Stern-Gerlach experiment the wave packet describing the atom ensemble moves in the positive direction of the $x$-axis, as shown in Figure 1.

[Figure 1. Stern-Gerlach experiment with the wave packet moving in the positive direction of the $x$-axis. Labels: atom beam, screen.]

First of all, the wave-packet spread in the $x$-direction has to be substantially larger than the size of the apparatus. Otherwise, during the time $T$ in which the wave packet travels a distance equal to its own spread $L_x$, the phase difference (42) in the exponent of formula (49) changes by

$$\Delta\varphi = \frac{\Delta E\,T}{\hbar} \approx \frac{p_x\,\Delta p_x}{M\hbar}\,T \approx \frac{p_x\,\Delta p_x}{M\hbar}\,\frac{M L_x}{p_x} = \frac{L_x\,\Delta p_x}{\hbar} \gtrsim 1 \tag{61}$$

where $p_x$ is the $x$-component of the atom's momentum; i.e. the wave function has time to change during the experiment, which cannot be shorter than $T$.

Let the wave-packet size $L_x$ be much larger than the apparatus size $l_x$, and let the uncertainty $\Delta p_x$ of the atom's momentum along the $x$-axis be small. Let $L_z$ be the wave-packet spread in the direction of the $z$-axis.
During its passage between the poles of the magnet the atom acquires a momentum

$$p_z = \mu_H\,\frac{\partial H}{\partial z}\,T, \qquad T = \frac{M l_x}{p_x} \tag{62}$$

where $T$ is the interaction time of the atom with the magnetic field, i.e. the transit time. For beam separation it is necessary that

$$p_z > \Delta p_z > \frac{\hbar}{L_z} \tag{63}$$

where $\Delta p_z$ is the uncertainty of the momentum of the atom along the $z$-axis. Atoms with different magnetic moment projections have different energies. This causes a change in the phase difference between the Fourier components of the wave function which correspond to different values of $\mu_H$. At best, during the time $T$ the change of the phase difference is

$$\Delta\varphi = \frac{\Delta E\,T}{\hbar} \approx \frac{1}{\hbar}\,\mu_H\,\frac{\partial H}{\partial z}\,L_z\,T \tag{64}$$

Substituting (62) and using (63), one gets

$$\Delta\varphi > 1 \tag{65}$$

This means that during the measurement time the wave function changes essentially and the measurement result cannot be attributed to a definite wave function.

Thus, it follows from the analysis that quantities such as momentum, energy and angular momentum cannot be measured in the sense that the measured values cannot be attributed to definite ensemble states. This means that the statements of quantum mechanics cannot be verified experimentally for all physical quantities $R$, because for such a test it is necessary to attribute the measured value to a definite state. The statistical statements (8) can be verified for $R = F(q)$, and perhaps for $R = p$, but there are quantities for which they cannot be verified, for instance $R = p^2$. Thus the statistical statements of quantum mechanics can be verified only in particular cases and cannot be verified in general. This is a corollary of the time-energy uncertainty principle. In principle, this is connected with the non-relativistic character of quantum mechanics, according to which a wave function is given at one moment of time. This is in contradiction to the time-energy uncertainty principle, which requires that an ensemble description be 'spread over time'.
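The chain of estimates (62)-(65) can be checked with a few lines of arithmetic. The sketch below is only an illustration under assumed numbers: the magnetic moment projection, field gradient, packet width and transit time are hypothetical values chosen for the example, and the function names are not from the original text.

```python
# Sketch of the estimates (62)-(65); all numerical inputs are hypothetical.

HBAR = 1.054571817e-34  # reduced Planck constant, J*s


def acquired_momentum(mu_h, field_gradient, transit_time):
    """Transverse momentum of eq. (62): p_z = mu_H * (dH/dz) * T."""
    return mu_h * field_gradient * transit_time


def beams_separate(p_z, packet_width_z):
    """Separation condition (63) in its weakest form: p_z > hbar / L_z."""
    return p_z > HBAR / packet_width_z


def phase_change(mu_h, field_gradient, packet_width_z, transit_time):
    """Phase difference change of eq. (64): mu_H * (dH/dz) * L_z * T / hbar."""
    return mu_h * field_gradient * packet_width_z * transit_time / HBAR


mu_h = 9.27e-24        # one Bohr magneton, J/T (assumed projection)
field_gradient = 1.0e3 # T/m (assumed)
packet_width_z = 1.0e-5  # L_z in metres (assumed)
transit_time = 1.0e-4    # T in seconds (assumed)

p_z = acquired_momentum(mu_h, field_gradient, transit_time)
delta_phi = phase_change(mu_h, field_gradient, packet_width_z, transit_time)

# By construction delta_phi equals p_z * L_z / hbar, so condition (63) forces delta_phi > 1.
print(beams_separate(p_z, packet_width_z), delta_phi > 1.0)
```

Whatever numbers are chosen, the phase change equals $p_z L_z/\hbar$, so any choice that satisfies the separation condition (63) also gives $\Delta\varphi > 1$, as stated in (65).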
Indeed, in quantum mechanics the description of a particle by means of a wave function makes it 'spread over space', while relativistic symmetry requires that it also be 'spread over time'.

If the statistical statements of quantum mechanics cannot be tested experimentally, then their consequences remain doubtful: for instance, von Neumann's theorem on hidden variables, or the statement that in the measurement of a quantity $R$ only values equal to the eigenvalues of the operator $\hat{R}$ can be obtained. For instance, as we have seen, the Stern-Gerlach experiment does not prove at all that the angular momentum takes only $\hbar$-fold values. At the same time some consequences of formula (8) may be correct, even if it is not always valid. At least it is possible to consider formula (8) as correct, because it has not been proved that it is incorrect, but only that it cannot be proved experimentally. At the same time I believe that an alternative conception would be welcome which would explain the impossibility of measuring quantities of energy-momentum character and which would, in general, consider as observable only those quantities which can be measured experimentally.

3. IDEAS OF RELATIVISTIC STATISTICS

Let us try to look at quantum mechanics from another viewpoint. Let us imagine that the motion of a microscopic particle is not deterministic (for instance, because of its interaction with the surroundings), i.e. that it behaves like a Brownian particle. This means that the particle's world-line is distorted in a random way. Let us suppose that a statistical description of such non-deterministic world-lines can lead to the same results as those obtained by quantum mechanics.
Of course, it is hopeless to try to obtain all the basic statements (1)-(5) of quantum mechanics, because on account of the von Neumann theorem on hidden variables they are certainly inconsistent with the supposition that a particle is described by means of a definite world-line (see Moyal, 1949). As we have seen, not all statements of quantum mechanics can be tested experimentally. For this reason some hope remains for a successful formulation of world-line statistics in such a way that disagreement with statements (1)-(5) occurs only within an unobservable domain.

It should be noted that attempts to interpret quantum mechanics from a classical or quasi-classical point of view are numerous. They are known as hidden-variables theories. A review of different versions of such theories and their bibliography can be found in the survey by Kaliski (1970) and in the monograph by Belinfante (1973). In most cases such theories represent attempts to interpret the basic statements of quantum mechanics from different viewpoints. Attempts to obtain the results of quantum mechanics starting from classical statistics in its pure form have not succeeded as a rule, because non-relativistic statistics have been used. The use of non-relativistic statistics was motivated by the fact that quantum mechanics is also non-relativistic.

As far as possible I shall adhere consistently to the relativistic viewpoint. First of all it is necessary to distinguish between the non-relativistic and relativistic notions of state. The non-relativistic state (n-state) of a system is a set of quantities given at a certain moment of time. For instance, the n-state of a particle is determined at a certain moment by the coordinates $q$ and momenta $p$, i.e. by a point in the phase space of coordinates and momenta.
In non-relativistic physics the division of the description of physical phenomena into state and equations of motion is connected with the existence of absolute simultaneity and of an invariant division of space-time into space and time (two invariants: time interval and distance). Correspondingly, a particle is considered as a point in three-dimensional space.

The relativistic state (r-state) is given over all space-time. For instance, the r-state of a particle is its world-line, described by the equations $q^i = q^i(\tau)$ ($i = 0, 1, 2, 3$), where $\tau$ is a parameter along the world-line. Equations of motion play the part of restrictions imposed upon the possible r-states. In relativistic physics the unified description (without division into state and equations of motion) is connected with the absence of an invariant division of space-time into space and time (the only invariant is the space-time interval). Correspondingly, a particle is considered as a one-dimensional line in space-time (and not as a point in space).

In non-relativistic physics a statistical method is used for the description of non-deterministic systems (i.e. systems with uncertainty in the equations of motion or with uncertainty in the initial state). One considers a statistical ensemble, i.e. a set of many identical systems which are in different states. The dynamical systems constituting the ensemble are known as elements of the ensemble. The statistical ensemble is a deterministic dynamical system even if the constituent systems are non-deterministic. This means that the n-state of the ensemble at time $t$ can be calculated if its n-state at an earlier time $t_0$ ($t_0 < t$) is given.

For example, let a dynamical system A consist of a non-deterministic particle. The ensemble consists of $N$ such independent particles ($N \to \infty$). The state of every particle is represented by a point in the phase space. Let $d\Omega$ be an element of volume of the phase space, and let $dN$ be the number of points in $d\Omega$.
Then

$$dN = W\,d\Omega \tag{66}$$

where $W = W(q, p)$ is the state density. Although any individual system of the ensemble is non-deterministic, the evolution of the ensemble n-state $W$ can be calculated, because $W$ obeys an equation whose form depends on the character of the random forces acting upon the particles of the ensemble. In the case when the systems A constituting the ensemble are deterministic, the form of the equation obeyed by $W$ is determined uniquely by the equation of motion of system A. Thus $W$ is the n-state of the statistical ensemble as a dynamical system.

The equation obeyed by $W$ is invariant with respect to the transformation $W \to CW$, where $C$ is a constant. This is so because the behaviour of the ensemble does not depend on the number of systems constituting it, if this number is large enough. The constant $C$ can be chosen in such a way that $W(q, p)\,d\Omega$ represents the probability of detecting the n-state of a particle in the volume $d\Omega$ of the phase space. That the ensemble n-state is a probability density is connected with the representation of the state of the system by a point (and not a line or surface) in the phase space. As $W(q, p)$ is the probability density of detecting the physical system in the state $(q, p)$, then, calculating $W(q, p)$ at any moment $t$ by means of the equation of motion of the statistical ensemble, one can calculate the evolution of the mean value $\langle F(q, p) \rangle$ of any function $F$ of the n-state $(q, p)$.

I have described in general terms the conventional scheme by which the statistical ensemble is applied to describe the behaviour of non-deterministic systems. Three essential points in this scheme should be stressed.
(1) The transition from a physical system to the ensemble, i.e. the method of construction of the ensemble state.
(2) The determination of the equations which are obeyed by the ensemble state, and the solution of these equations.
(3) The transition from an ensemble state to an individual system, i.e. the method of calculating the statistical characteristics of the non-deterministic system proceeding from an ensemble state.

In the generalization of the statistical method to the relativistic case different variants are possible. I shall consider three of them.
(1) In the first variant the basic statement asserts that in the relativistic case the ensemble state is represented by a probability density. For this it is necessary that the state of a system be represented by a point in a phase space. To achieve this, the description of a particle by means of the r-state (world-line) is dropped, and one considers the intersection points of the world-line with different three-dimensional surfaces, or points marked by parameters such as proper time (Hakim, 1967a, b, 1968). Such points together with the momenta represent the state of a particle, as in non-relativistic physics. The requirements of relativity are taken into account by imposing Lorentz-invariance conditions upon the corresponding equations. This approach is the most widely accepted, but I shall not use it. I believe that using the world-line as the main subject of a theory is far more important than being able to apply the developed formalism of probability theory.
(2) The second method consists in considering any particle r-state (world-line) as a point in a certain functional phase space $\mathfrak{M}$, the r-state being kept as the basic subject of the theory and the ensemble state being a probability density on $\mathfrak{M}$. However, if $l$ is a segment of the world-line located within a region $\Omega$ of space-time, then the question of which region of $\mathfrak{M}$ corresponds to $\Omega$ is decided by the behaviour of the world-line outside $\Omega$. This means non-locality of the description. Such a description seems to me unsatisfactory.
(3) The method which I shall use keeps the r-state as the basic subject of the theory.
It realizes a local description, but the ensemble state, i.e. the state density of the systems constituting the ensemble, cannot be treated as a probability density.

For instance, let there be an ensemble of systems consisting of one particle, i.e. an ensemble of world-lines. Let $ds_i$ ($i = 0, 1, 2, 3$) be an infinitesimal area element at the point $q$ and $\Delta N$ be the number of world-lines crossing $ds_i$. Then

$$\Delta N = \sum_{i=0}^{3} j^i\,ds_i \tag{67}$$

where $j^i$ is a factor which is, by definition, the density of r-states (world-lines) at the point $q$ of space-time. The $j^i$ considered at a certain moment of time is the ensemble n-state. The same $j^i$ considered in the whole of space-time (or in some region) is the r-state. In this sense the r-state of the ensemble coincides with the n-state.

In the case when the ensemble elements are dynamical systems consisting of $N$ particles, the r-state of such a system is described by an $N$-dimensional surface in the $4N$-dimensional space $V_{12\ldots N} = V_1 \otimes V_2 \otimes \cdots \otimes V_N$, which is a tensor product of the spaces $V_i$ ($i = 1, 2, \ldots, N$) of the individual particles (see, for instance, Hakim, 1967a, b). The state density of such systems is described by the pseudotensor $j^{a_1 a_2 \ldots a_N}$ on $V_{12\ldots N}$, antisymmetric in all indices ($a_1, a_2, \ldots, a_N = 1, 2, \ldots, 4N$). Later on, for simplicity, I shall confine myself to the case when the ensemble element is a dynamical system consisting of one particle.

The method of transition from the system r-state to the statistical ensemble is thereby defined. It coincides with the conventional method: the ensemble state is the state density of the systems constituting the ensemble. The reverse transition from the statistical ensemble to the properties of the system cannot coincide with the conventional one, because the conventional method is based on the fact that an ensemble state $W(q, p)$ is the probability density of detecting a system in the state $(q, p)$. Strictly speaking, neither $j^i$ nor $j^{a_1 a_2 \ldots a_N}$ can be treated in such a way.
For this reason the transition from the statistical ensemble to the non-deterministic system is based on the use of additive quantities.

Definition. The quantity $B$ is an additive one if the value of $B$ for several independent systems is equal to the sum of the values of $B$ for every system.

Energy, momentum, angular momentum and their densities are examples of additive quantities. The statistical ensemble is a set of independent systems. For this reason any additive quantity attributed to the statistical ensemble as a dynamical system is the sum of the values of this quantity for all systems constituting the ensemble. As the behaviour of the ensemble does not depend on the number of systems in the ensemble, the equations for the ensemble state $j$ are invariant with respect to the transformation $j \to Cj$, where $C$ is a constant and $j$ denotes any ensemble state: $W$, $j^i$, .... Hence $j^i$ can be normalized to one system. In this case the value of any additive quantity of the ensemble is equal to the mean value, over the ensemble, of this quantity for the systems constituting the ensemble. In the non-relativistic approximation, when one of the components of $j$ is non-negative and conserved (for instance, $j^0$ for a one-particle system), it is possible to treat this component as a corresponding probability density. In this case it is possible to obtain additional information.

Let us formulate the above in axiomatic form.

The statistical principle

A deterministic dynamical system, which is called a statistical ensemble, corresponds to a non-deterministic* dynamical system A whose state is described by quantities $\xi$.
(1) A state $j$ of the statistical ensemble is a state density of systems A.
(2) The equations for the ensemble state $j$ are invariant with respect to the transformation

$$j \to Cj, \qquad C = \text{constant} \tag{68}$$

(3) If the ensemble state $j$ has the proper normalization (to one system), every additive quantity $B$ attributed to the statistical ensemble as a dynamical system is the mean value of the quantity $B$ for system A.

*The statistical principle can be applied to a deterministic system if its initial conditions are not exactly determined.

The statistical principle fits both the relativistic and the non-relativistic case. It settles the question of how the ensemble state is determined and how the properties of the non-deterministic system are determined, but it does not determine which equations are obeyed by the ensemble state. However, if the ensemble elements are deterministic systems, then the equation satisfied by the ensemble state $j$ is determined by the equations which a single ensemble element obeys. This fact can be used to simplify the derivation of the equations which the ensemble state $j$ obeys.

I shall show this by the simple example of an ensemble E whose elements are free particles of mass $m$, i.e. all possible timelike straight lines in space-time. Although it is possible to introduce a state density for such an ensemble by means of (67), this is still not sufficient to describe the ensemble completely. This means that if the state $j^i$ is given at some moment $t$, then, in general, $j^i$ cannot be determined at another moment. The state of such an ensemble can be described by a distribution function $f(q, p)$, $q = \{q^0, q^1, q^2, q^3\}$, $p = \{p_0, p_1, p_2, p_3\}$, which satisfies the equation

$$\frac{\partial}{\partial q^i}\bigl(p^i f(q, p)\bigr) = 0 \tag{69}$$

where $p^i$ is the 4-momentum of a particle. Here and later on, summation from zero to three is implied over like arabic super- and subscripts.
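Points (2) and (3) of the statistical principle stated above can be illustrated with a toy non-relativistic ensemble. The sketch below is an assumption-laden illustration, not the author's construction: the ensemble state is taken to be a table of counts $dN$ over a few phase-space cells $(q, p)$, the additive quantity is the kinetic energy, and all names and numbers are hypothetical. It shows that rescaling the state by a constant $C$ leaves the mean of an additive quantity unchanged once the state is normalized to one system.

```python
# Sketch: invariance under W -> C*W (eq. 68) and normalization to one system.
# The cells, counts and mass below are hypothetical illustration values.

MASS = 1.0  # hypothetical particle mass, arbitrary units


def normalize(state):
    """Normalize an ensemble state (weights over discrete cells) to one system."""
    total = sum(state.values())
    return {cell: weight / total for cell, weight in state.items()}


def ensemble_mean(state, quantity):
    """Mean of an additive quantity for one system, computed from the normalized state."""
    normalized = normalize(state)
    return sum(weight * quantity(cell) for cell, weight in normalized.items())


def kinetic_energy(cell):
    """Additive quantity for a single particle in the phase-space cell (q, p)."""
    _q, p = cell
    return p * p / (2.0 * MASS)


# Number of systems dN found in each phase-space cell (q, p).
counts = {(0.0, -1.0): 20, (0.0, 1.0): 30, (1.0, 2.0): 50}
# The same state multiplied by a constant C = 1000.
scaled = {cell: 1000.0 * n for cell, n in counts.items()}

mean_energy = ensemble_mean(counts, kinetic_energy)
mean_energy_scaled = ensemble_mean(scaled, kinetic_energy)
print(mean_energy, mean_energy_scaled)
```

The two printed means agree, which is the content of the statement that the ensemble behaviour does not depend on the number of systems in it.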
The distribution function $f(q, p)$ vanishes except over the surface

$$p_i\,g^{ik}\,p_k = m^2 c^2 \tag{70}$$

where $g^{ik}$ is the metric tensor,

$$g_{ik} = \begin{pmatrix} c^2 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \qquad g^{ik} = \begin{pmatrix} c^{-2} & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{71}$$

and $m$ is the mass of a particle. The meaning of the distribution function is determined by the fact that the stream density $j^i$ of world-lines at the point $q$ is

$$j^i = \int \frac{p^i}{m}\,f(q, p)\,d^4p, \qquad d^4p = dp_0\,dp_1\,dp_2\,dp_3 \tag{72}$$

However, instead of the description by means of a distribution function one can use the following. Let us consider the ensemble E as consisting of elements E$_p$, where E$_p$ is an ensemble whose elements are straight lines having the direction of the unit vector $p_i/mc$. The ensembles E$_p$ are completely described by the pseudovector $j^i$. I shall call such an ensemble a pure one. As long as the E$_p$ are dynamical systems, they can be elements of the ensemble E. It is reasonable to normalize the states $j^i$ of the E$_p$ in a similar way. It is easy to verify that the $j^i(q)$ satisfy the equations (73) [display not recoverable from the source; the surviving fragments show a derivative with respect to $q$ and the contraction $g_{sk} j^k$, with $i = 0, 1, 2, 3$]. Let us normalize the $j^i$ to one system by means of

$$\int_{\Sigma} j^i\,ds_i = 1 \tag{74}$$

where $\Sigma$ is an infinite space-like hypersurface and $ds_i$ is an element of the hypersurface. All physical quantities attributed to a pure ensemble as a dynamical system can be expressed through the state $j^i$. In the given case, when all ensembles E$_p$ can be labelled by the parameters $p_i$, the state of the ensemble E can be described by means of non-negative quantities $W(p)$ which represent the probability density of detecting a pure ensemble E$_p$ in the ensemble E. Finally, everything here can be reduced to a distribution function $f(q, p)$.

However, in the case when non-deterministic one-particle systems are considered, the equations which the pure ensemble state satisfies can be more complicated than equations (73), and the possibility of labelling all their solutions by means of a finite number of parameters is not evident.
For example, suppose that for the description of a single system it is insufficient to give the coordinates and momenta, and it is necessary to give the quantities $q, \dot{q}, \ddot{q}, \ldots$, where $q = \{q^0, q^1, q^2, q^3\}$ and the dot denotes differentiation with respect to the proper time $\tau$. Such a situation arises, for example, in the consideration of the Lorentz-Dirac equations. In this case the distribution function $f(q, \dot{q}, \ddot{q})$ is used (Hakim, 1967b). For the non-relativistic case a more general consideration can be found in the book by Vlasov (1966).

In place of the consideration of distribution functions of the type $f(q, \dot{q}, \ddot{q}, \ldots)$ one can consider an arbitrary ensemble E as an ensemble whose elements are pure ensembles E$_j$ described by means of $j^i$, with $j^i$ being a function of $q$ only. The dependence of the ensemble E on the distributions over $\dot{q}, \ddot{q}, \ldots$ manifests itself as the implicit dependence of E on the E$_j$. This has been shown in the simple example of the distribution function $f(q, p)$. The consideration of the ensemble E as consisting of elements E$_j$ has the advantage that, in general, one need not be interested in how many derivatives like $\dot{q}, \ddot{q}, \ldots$ there are and in which way the distribution function depends on them. All that it is necessary to know are the equations which the state of the pure ensemble satisfies. The state of an arbitrary ensemble satisfies equations which can be obtained merely as a result of a formal transformation of the pure ensemble equations. It is just this fact that explains the mysterious circumstance that the pure state of quantum mechanics (the wave function) depends only on coordinates, while the usual classical distribution function depends on coordinates and momenta. The dependence on momenta (and not only on momenta but, in general, on quantities like $\dot{q}, \ddot{q}, \ldots$) is taken into account in quantum mechanics in the consideration of mixed states, i.e. ensembles whose elements are pure ensembles.
In the general case the state of an ensemble E is described by a function $W$ of the states $j^i$ of the pure ensembles E$_j$ which are elements of the ensemble E. The state of each pure ensemble E$_j$ is completely described by the four functions $j^i$ ($i = 0, 1, 2, 3$). This means that the $j^i$ satisfy certain equations, and that all physical quantities are expressed through the $j^i$.

In the case in which the state $W$ of the ensemble E is considered as an r-state, $W$ should be considered as a function of the four quantities $j^i$ ($i = 0, 1, 2, 3$), which are themselves considered as functions of the space coordinates $\mathbf{q}$ and the time $t$. As it is necessary to use relations derived using non-relativistic statistics, it is convenient to consider the state $W$ of the ensemble E as the n-state, determined at any moment of time $t$. In this case the n-state $W$ of the ensemble is a function of the four quantities $j^i$ ($i = 0, 1, 2, 3$), considered as functions of the space coordinates $\mathbf{q}$. Besides, $W$ is a function of the time $t$. It is supposed that, if for a pure ensemble E$_j$ the state $j^i$ ($i = 0, 1, 2, 3$) is given at the moment $t$, then it can be determined at all subsequent moments of time.

The dependence of the n-state $W[j]$ on time can be determined from the relation

$$W_t[j_t] = W_0[j_{t=0}], \qquad j = \{j^0, j^1, j^2, j^3\} \tag{75}$$

The index $t$ of $W_t$ and $j_t$ shows the moment at which these values are taken. The relation (75) expresses the fact that the change of form of the functional $W[j]$ is caused only by the change of the functions $j^i$ which is described by the equations of motion. Let the equations of motion of the n-state of the pure ensemble E$_j$ have the form

$$\frac{\partial j^i}{\partial t} = G^i, \qquad i = 0, 1, 2, 3 \tag{76}$$

where the $G^i$ are some functions of the $j^i$ and their derivatives. Differentiating (75) with respect to $t$ and using the expressions (76) for $\partial j^i/\partial t$, one gets the following linear equation with variational derivatives.
$$\frac{\partial W}{\partial t} + \sum_{i=0}^{3} \int G^i(\mathbf{x})\,\frac{\delta W}{\delta j^i(\mathbf{x})}\,d\mathbf{x} = 0 \tag{77}$$

The equations of motion (76) of a pure ensemble are the characteristics of the linear equation (77).

Let us apply the above consideration to an arbitrary ensemble of non-deterministic world-lines. In the paper (Rylov, 1971) I considered, in the non-relativistic approximation, a pure ensemble whose elements are non-deterministic systems, each consisting of one particle moving in a given electromagnetic field. It has been shown that it is possible to choose the equation of motion (76) of a pure ensemble in such a way that all the basic results of one-particle quantum mechanics can be deduced from the statistical principle. Namely, supposing that $j^\alpha/j^0$ ($\alpha = 1, 2, 3$) has a potential and can be represented in the gradient form (78) [display not recoverable from the source], it was shown that the function

$$\psi = \sqrt{\rho}\,\exp(\ldots), \qquad \rho = j^0 \tag{79}$$

[exponent not recoverable from the source] satisfies the Schrödinger equation

$$i\hbar\,\frac{\partial \psi}{\partial t} = H\psi \tag{80}$$

where $H$ is the Hamiltonian (81) [display not recoverable from the source], $A_i$ ($i = 0, 1, 2, 3$) is the 4-potential of the electromagnetic field, and $e$ and $m$ are the charge and mass of the particle, respectively. With the proper normalization of the wave function (79), the rule for calculating the mean value $\langle R \rangle$ of the quantity $R$ has the form (8) for quantities representing momentum, energy, angular momentum and arbitrary functions of the coordinates. The stationary states of an ensemble are described by a wave function which is an eigenfunction of the Hamiltonian (81).

Thus the difference between these results and the basic statements of quantum mechanics consists only in that (8) is fulfilled not for all physical quantities, but only for certain ones. The differences lie just in those points that, as we have seen, cannot be tested experimentally.

Now, using the statistical principle, I am going to generalize the results obtained for a pure ensemble of one-particle systems to the case of an arbitrary ensemble of one-particle systems.
In the case when restriction (78) is fulfilled, the $j^i$ ($i = 0, 1, 2, 3$) are determined by the wave function (79). In this case the n-state $W$ of an arbitrary ensemble E can be considered as a function $W[\psi]$ of the two independent quantities $\psi$ and $\psi^*$ ($\psi^*$ is the complex conjugate of $\psi$). The relation (77) takes the form

$$\frac{\partial W}{\partial t} + \frac{1}{i\hbar} \int \left( \frac{\delta W}{\delta \psi(\mathbf{x})}\,H\psi(\mathbf{x}) - \bigl(H\psi(\mathbf{x})\bigr)^* \frac{\delta W}{\delta \psi^*(\mathbf{x})} \right) d\mathbf{x} = 0 \tag{81a}$$

where $*$ denotes the complex conjugate and $H$ is the Hamiltonian (81). As $W[\psi]$ represents the probability density of finding a pure ensemble described by the wave function $\psi$, the mean value $\langle R \rangle_E$ of the quantity $R$ over the ensemble E can be written in the form

$$\langle R \rangle_E = \int W[\psi]\,\langle R \rangle_\psi\,d[\psi] \tag{82}$$

where the integral denotes integration over the whole functional space of values of the functions $\psi$ and $\psi^*$, and $\langle R \rangle_\psi$ denotes the mean value of the quantity $R$ over the pure ensemble described by the wave function $\psi$. All functions are supposed to be normalized by means of the relation

$$\int \psi^*(\mathbf{x})\,\psi(\mathbf{x})\,d\mathbf{x} = 1 \tag{83}$$

Let us choose an orthogonal basis $\{\chi_i\}$ in the Hilbert space of functions $\psi$. Let the function $\psi$ be decomposed on this basis according to the relations (20) and (21), the decomposition coefficients $a_i = a_i[\psi]$ being functionals of $\psi$ and $\psi^*$. Due to (8) and (20) one obtains

$$\langle R \rangle_\psi = \sum_{i,k} a_i^*[\psi]\,a_k[\psi]\,R_{ik} \tag{84}$$

where $R_{ik}$ is a matrix element of the operator $\hat{R}$ on the basis $\{\chi_i\}$,

$$R_{ik} = \int \chi_i^*(\mathbf{x})\,\hat{R}\,\chi_k(\mathbf{x})\,d\mathbf{x} \tag{85}$$

Substituting (84) into (82) one obtains

$$\langle R \rangle_E = \sum_{i,k} R_{ik}\,U_{ki} = \mathrm{Sp}\{\hat{R}\hat{U}\} \tag{86}$$

where the following notation is used

$$U_{ki} = \int W[\psi]\,a_k[\psi]\,a_i^*[\psi]\,d[\psi] \tag{87}$$

and $\hat{U}$ denotes the operator with matrix elements $U_{ik}$ on the basis $\{\chi_i\}$. The operator $\hat{U}$ depends only on the state $W[\psi]$ of the ensemble E and does not depend on the quantity $R$. The relation (86) coincides with (3), which describes the rule for the calculation of the mean value for an arbitrary ensemble.
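The passage from (82) and (84) to the trace formula (86) can be checked on a finite example. The following is a minimal sketch under stated assumptions: the functional integral over $\psi$ is replaced by a sum over a finite, hypothetical list of pure states with weights $W_n$, the basis is two-dimensional, and the matrix $R_{ik}$ and all function names are illustrative only.

```python
# Sketch: discrete analogue of eqs (82), (84), (86) and (87).
# The weights, coefficients and matrix below are hypothetical illustration values.
import math


def pure_mean(r_matrix, a):
    """<R>_psi = sum_ik conj(a_i) * a_k * R_ik, eq. (84)."""
    dim = len(a)
    return sum(a[i].conjugate() * a[k] * r_matrix[i][k]
               for i in range(dim) for k in range(dim))


def state_operator(weights, coefficient_sets):
    """U[k][i] = sum_n W_n * a_k * conj(a_i): a finite sum in place of the integral (87)."""
    dim = len(coefficient_sets[0])
    return [[sum(w * a[k] * a[i].conjugate() for w, a in zip(weights, coefficient_sets))
             for i in range(dim)] for k in range(dim)]


def trace_product(r_matrix, u_matrix):
    """Sp{RU} = sum_ik R_ik * U_ki, eq. (86)."""
    dim = len(r_matrix)
    return sum(r_matrix[i][k] * u_matrix[k][i]
               for i in range(dim) for k in range(dim))


s = 1.0 / math.sqrt(2.0)
coefficient_sets = [[1.0 + 0.0j, 0.0j], [s + 0.0j, s * 1j]]  # two normalized pure states
weights = [0.25, 0.75]                                       # W_n, summing to one
r_matrix = [[1.0 + 0.0j, 2.0 - 1.0j], [2.0 + 1.0j, -1.0 + 0.0j]]  # Hermitian R_ik

# Left-hand side: weighted sum of pure-state means, the finite analogue of (82).
mixed_mean = sum(w * pure_mean(r_matrix, a) for w, a in zip(weights, coefficient_sets))
# Right-hand side: trace formula (86) with the state operator of (87).
u_matrix = state_operator(weights, coefficient_sets)
trace_mean = trace_product(r_matrix, u_matrix)

print(abs(mixed_mean - trace_mean) < 1e-12)
```

The state operator is built once from the ensemble and can then be reused for any quantity $R$, which mirrors the remark that $\hat{U}$ does not depend on $R$.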
Multiplying (81a) by $a_i[\psi]\,a_k^{*}[\psi]$ and integrating over the functional space of functions $\psi$, one obtains after transformations an equation describing the evolution of the matrix elements $U_{ik}$ of the state operator $U$. It has the form

$$i\hbar\frac{\partial U}{\partial t} + UH - HU = 0 \tag{88}$$

and is equivalent to (4).

Thus, in the non-relativistic approximation one succeeds in showing that for an arbitrary ensemble of one-particle systems the basic statements of quantum mechanics (1)-(4) can be obtained starting from the statistical principle, with the restriction that the calculation rule (86) for mean values can be applied only to arbitrary functions of coordinates and to the additive quantities: energy, momentum and angular momentum. The last restriction arises because the expression (84) for $\langle R\rangle_\psi$ has been obtained from (8), which is valid in this case only for those quantities $R$.

Using the statistical principle, the results derived for a pure ensemble of two-particle systems (Rylov, 1973a, b) can be generalized to the case of an arbitrary ensemble.

It is interesting to consider how the process of momentum measurement described in the second section looks from the point of view of relativistic statistics (as I call the conception based on the r-state and the statistical principle). Let us imagine an ensemble of ordinary Brownian particles (specks of dust) suspended in a moving gas. The gas velocity is supposed to be less than the thermal velocity of the gas molecules, and the mass of a speck of dust is supposed to be of the order of the mass of a molecule. The motion of the particles has the character of random wandering. If the gas motion is neglected, then the mean square displacement of a particle during the time $t$ with respect to its initial position has the form

$$\langle(\mathbf{q} - \mathbf{q}_0)^2\rangle = 2Dt \tag{89}$$

where $\mathbf{q}_0$ is the initial position of a particle, $\mathbf{q}$ is its position at time $t$, and $D$ is a diffusion coefficient.
Defining the mean velocity $\mathbf{v}_t$ during the period $t$ by means of

$$\mathbf{v}_t = \frac{\mathbf{q} - \mathbf{q}_0}{t} \tag{90}$$

one finds that the mean velocity of the particle decreases as the period $t$ is increased and tends to zero as $t \to \infty$. This follows from the fact that the root-mean-square displacement is proportional to $\sqrt{t}$, as is seen from (89).

If the gas motion is taken into account, then the contribution of the systematic motion of the gas to the root-mean-square displacement is proportional to $ut$, where $u$ is the mean velocity of the gas. This contribution dominates if $t$ is large enough. As a result one of the gas streams takes the speck of dust away. The mean velocity during the time $t$ will be different for different specks of dust but, in general, as $t \to \infty$ it will not tend to zero; it will tend to a value equal to the velocity of the gas stream taking the speck of dust away. The velocity of each element of gas is supposed to tend to a constant as $t \to \infty$. The magnitude of the velocity of the speck of dust averaged over the period $t$ (with $t \to \infty$) depends only on which gas stream catches that speck.

Let us estimate the distribution of the dust specks over velocities using (90). If the time $t$ is of the order of the mean free time, then the velocity distribution will be a Maxwellian one with a temperature close to that of the gas. With increasing time $t$ the distribution remains Maxwellian, but its temperature decreases until the root-mean-square velocity of the specks calculated by means of (89) and (90) becomes of the order of the gas speed. With further increase of the time $t$ the velocity distribution of the specks is determined mainly by the gas-stream velocities.

The dependence of the velocity distribution on the measurement time $t$ is rather like that of an electron in the state described by a wave function. The only difference is the absence of an agent taking the electron away as the gas does with the dust specks.
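The two regimes described above can be made concrete with a short numerical sketch. It assumes, as a simplification not stated in the text, that the diffusive displacement (89) and a steady drift with speed $u$ are independent, so that their mean squares add: $\langle(\mathbf{q}-\mathbf{q}_0)^2\rangle = 2Dt + u^2t^2$. The function names and the numerical values are illustrative only.

```python
import math

def rms_mean_velocity(D, t, u=0.0):
    """Root-mean-square of v_t = (q - q0)/t, cf. (89) and (90).

    Assumption: diffusion (2*D*t) and a steady drift (u*t) contribute
    independently to the mean square displacement.
    """
    mean_square_displacement = 2.0 * D * t + (u * t) ** 2
    return math.sqrt(mean_square_displacement) / t

def crossover_time(D, u):
    """Time at which the drift term u^2 t^2 equals the diffusive term 2 D t."""
    return 2.0 * D / u ** 2

if __name__ == "__main__":
    D, u = 1.0, 0.1                      # hypothetical units
    for t in (1.0, 10.0, 100.0, 1e4, 1e6):
        print(t, rms_mean_velocity(D, t), rms_mean_velocity(D, t, u))
    print("crossover time:", crossover_time(D, u))
```

Without drift the averaged velocity falls off as $\sqrt{2D/t}$; with drift it levels off at $u$ once $t$ is well beyond the crossover time $2D/u^2$, which is the behaviour attributed to the gas streams in the text.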
In quantum mechanics the statistical ensemble (a dynamical system!) plays the part of such a carrying agent. The conservation laws of energy and momentum are fulfilled for the ensemble. This prevents the electron momentum from being dissipated and provides a constant motion of the electron in a definite direction. Thus, from the viewpoint of relativistic statistics, the q-momentum measurement represents the measurement of the momentum of an electron averaged over a long period.

Relativistic statistics permits one to obtain all the basic statements of quantum mechanics with the reservation mentioned above. It is curious that this conception does not contain anything typical of quantum physics. It is based completely upon principles of classical mechanics and classical statistics. In addition, the motion of microscopic particles is assumed to be non-deterministic. Such an assumption in itself is not specific to quantum mechanics. It occurs occasionally in classical statistics, for instance in the description of Brownian particles. The only essential supposition concerns the form of the term which is added to the Lagrangian of a pure ensemble of deterministic particles to describe particle indeterminacy. This term has a universal form (it does not depend on particle characteristics), with Planck's constant as its coefficient. This is the only place where Planck's constant $\hbar$ is introduced into the theory. The introduction of such a term cannot be treated as a principle, because some supposition about the character of the indeterminacy of the particle motion has to be made in any case.

Although classical (non-quantum) in form, relativistic statistics is in no sense a return to classical mechanics. It gives a less detailed description of a dynamical system than quantum mechanics does, and far less detail than in the classical case. It is a further step on the way to the restriction of detailed description in a microcosm.
Being essentially a relativistic theory, relativistic statistics coincides with quantum mechanics only in a non-relativistic approximation. Linearity (equation linearity, linear operators, the linear superposition principle), which is raised into a principle by quantum mechanics, is one of the main reasons for the advance of quantum mechanics. All this is conditioned by the non-relativistic character of the approximation. It is most curious that the conventional way of joining quantum theory with relativity, using wave functions, the linear superposition principle and so on, cannot be understood from the viewpoint of relativistic statistics. According to relativistic statistics nothing of that kind should be done. The non-relativistic approximation must not be used: the problem should be solved exactly. This claim of relativistic statistics does not seem a complete absurdity if it is taken into account that, in a consistent application, relativistic statistics contains the possibility of the generation of pairs, i.e. particle-antiparticle (Rylov, 1975).

The future will show how firmly the claims of relativistic statistics are founded.

REFERENCES

Aharonov Y. and Bohm D. (1961) 'Time in the quantum theory and the uncertainty relation for time and energy', Phys. Rev., 122, 1649-1658.
Ballentine L. E. (1970) 'The statistical interpretation of quantum mechanics', Rev. Mod. Phys., 42, 358-381.
Belinfante F. J. (1973) A Survey of Hidden-Variables Theories, Pergamon Press, Oxford.
Blokhintsev D. I. (1963) 'Foundation of quantum mechanics', Moscow, Leningrad (in Russian).
Blokhintsev D. I. (1968) The Philosophy of Quantum Mechanics, Reidel, Dordrecht, Holland. (Original Russian edition 1965).
Bohm D. (1965) Quantum Theory, Prentice-Hall, Englewood Cliffs, New Jersey.
Bohr N. (1918) 'On the quantum theory of line spectra', Kongelige Danske Videnskabernes Selskabs Skrifter, Naturvidenskabelig og
mathematisk Afdeling, 8. Række, Bd. 4, 1, 1-118.
Bohr N. (1928) 'The quantum postulate and the recent development of atomic theory', Nature, 121, 580-590.
Einstein A. (1949) in Albert Einstein: Philosopher-Scientist, P. A. Schilpp, Ed., Library of the Living Philosophers, Evanston (reprinted by Harper and Row, New York, p. 665).
Fock V. and Krylov N. (1947) 'On the uncertainty relation between time and energy', J. Phys. (U.S.S.R.), 11, 112-120.
Fock V. A. (1962) 'On the uncertainty relation between time and energy and one attempt to disprove it', J. Experim. Theoret. Phys., 42, 1135-1139 (in Russian).
Gerlach W. and Stern O. (1924) 'Über die Richtungsquantelung im Magnetfeld', Ann. Physik, 74, 673-699.
Hakim R. (1967a) 'Remarks on relativistic statistical mechanics. I', J. Math. Phys., 8, 1315-1344.
Hakim R. (1967b) 'Remarks on relativistic statistical mechanics. II. Hierarchies for reduced densities', J. Math. Phys., 8, 1379-1400.
Hakim R. (1968) 'Relativistic stochastic processes', J. Math. Phys., 9, 1805-1818.
Heisenberg W. (1927) 'Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik', Z. Physik, 43, 172-198.
Heisenberg W. (1930) Die Physikalischen Prinzipien der Quantentheorie, Leipzig, Section II, 2d.
Heisenberg W. (1955) 'Quantum theory and its interpretation', in Niels Bohr and the Development of Physics, London.
Kaliski S. (1970) 'A tentative classical approach to quantum mechanics', Proceedings of Vibration Problems, 11, 3-17.
Landau L. and Peierls R. (1931) 'Erweiterung des Unbestimmtheitsprinzips für die relativistische Quantentheorie', Z. Physik, 69, 56-69.
Landau L. D. and Lifshits E. M. (1963) Quantum mechanics, Moscow, Section 77 (in Russian).
Mandelstam L. and Tamm Ig. (1945) 'The uncertainty relation between energy and time in non-relativistic quantum mechanics', J. Phys. (U.S.S.R.), 9, 249-254.
Mandelstam L. I.
(1950) 'Lectures on foundation of quantum mechanics (Theory of indirect measurements)', in Complete collection of proceedings, 5, 345-415 (in Russian).
Margenau H. (1963) 'Measurement in quantum mechanics', Ann. Phys., 23, 469-485.
Moyal J. E. (1949) 'Quantum mechanics as a statistical theory', Proc. Cambridge Phil. Soc., 45, 99-124.
Neumann J. V. von (1932) 'Mathematische Grundlagen der Quantenmechanik', Berlin.
Pauli W. (1933) 'Die allgemeinen Prinzipien der Wellenmechanik', in H. Geiger and K. Scheel, Handbuch der Physik, 2nd ed., Vol. 24/1, Springer, Berlin, Chap. 2, pp. 83-272.
Pearle P. (1967) 'Alternative to the orthodox interpretation of quantum theory', Am. J. Phys., 35, 742-753.
Popper K. R. (1959) The Logic of Scientific Discovery, Basic Books, New York.
Reece G. (1973) 'Theory of measurement in quantum mechanics', Intern. J. Theoret. Phys., 7, 81-117.
Robertson H. P. (1929) 'The uncertainty principle', Phys. Rev., 34, 163-164.
Rylov Yu. A. (1971) 'Quantum mechanics as a theory of relativistic Brownian motion', Ann. Physik, 27, 1-11.
Rylov Yu. A. (1973a) 'Quantum mechanics as relativistic statistics. I: the two-particle case', Intern. J. Theoret. Phys., 8, 65-83.
Rylov Yu. A. (1973b) 'Quantum mechanics as relativistic statistics. II: the case of two interacting particles', Intern. J. Theoret. Phys., 8, 123-139.
Rylov Yu. A. (1975) 'The problem of particle generation in classical mechanics', in Investigation of Cosmic rays, Moscow (in Russian, pp. 171-177).
Vlasov A. A. (1966) Statistical Distribution Functions, Moscow (in Russian).
Wick G. C., Wightman A. S. and Wigner E. P. (1952) 'The intrinsic parity of elementary particles', Phys. Rev., 88, 101-105.
Wigner E. (1963) 'The problem of measurement', Am. J. Phys., 31, 6-15.

Uncertainty, Correspondence and Quasiclassical Compatibility

JAN J. SŁAWIANOWSKI
Polish Academy of Sciences, Warsaw

1. INTRODUCTION

There was no exaggeration in the famous metaphor of Sommerfeld comparing the correspondence principle with a magic wand. In fact, within the framework of the Old Quantum Theory this principle was the only method which enabled physicists to evaluate such physical quantities as the intensities of spectral lines and the polarizations of atomic radiation. However, the very nature and the internal logic of the correspondence itself remained unknown. Using the more fashionable terms of cybernetics, we might say that the Bohr correspondence principle worked like a black box.

There were many such 'magic wands' in quantum mechanics; some of them strongly influenced the development of the theory, especially in its early days. There is nothing strange in this: quantum theory aims to describe microscopic phenomena which are so far from our everyday experience that, when analysing them, all Newtonian intuitions break down. Therefore, from the point of view of a philosophy based on Newtonian mechanics, all the famous quantum postulates were incomprehensible and could be justified only at the stage of final results. As a matter of fact, the whole of the old quantum theory consisted of mysterious magic wands. For example, the theoretical status of the Bohr-Sommerfeld quantum conditions was completely inconceivable. The same is true of the Planck-Einstein postulates about the quantum granular structure of electromagnetic radiation. Similarly, the material waves postulated by de Broglie were more a mathematical idea than a physical picture in a classical sense.

From the heuristic point of view two so-called 'principles' were of special importance, namely the afore-mentioned correspondence principle and the Heisenberg uncertainty principle.

Some historical comments are necessary here. The correspondence principle was formulated in the days of the old quantum theory.
When formulating and developing the main ideas of quantum mechanics, Heisenberg, Born and Jordan referred explicitly to it (Heisenberg, 1925; Born and Jordan, 1925; Born, Heisenberg and Jordan, 1926). On the other hand, the uncertainty principle was derived by Heisenberg later, in 1927, two years after he had discovered matrix mechanics (Heisenberg, 1927). Hence it could be said that the uncertainty idea, being a consequence of the fundamental quantum rules, has not played any heuristic role in the development of quantum mechanics. However, this is not the case. The uncertainty principle is only a mathematical, quantitative comment on the qualitative postulate which enabled Heisenberg to discover the quantum rules (Heisenberg, 1925). In the following, I shall call it the uncertainty postulate or 'postulate of the phase nonlocalizability'. This postulate and the correspondence principle were just the magic wands which enabled Heisenberg to create matrix mechanics, the first correct formulation of quantum theory.

Now let us consider these postulates with special emphasis on their qualitative content and philosophical assumptions.

(A) The Correspondence Principle

According to this principle classical laws are asymptotic (approximate) expansions of the quantum ones. There are two meanings of the classical limit: (a) the formal one based on the transition $\hbar \to 0$ and (b) the physical asymptotics of large quantum numbers. Which of these two asymptotics is to be used depends on the kind of problem.

The oldest formulation of the correspondence principle, which is due to Bohr, concerned only the theory of atomic radiation. Bohr formulated some methodological guiding hints which enabled him to get some qualitative and even quantitative results concerning intensities and polarizations of spectral lines. The selection rules, for example, were formulated in this way.
The semi-classical theory of atomic spectra based on the correspondence principle was a hybrid of the classical theory of electromagnetic radiation and the Bohr-Sommerfeld quantum conditions imposed upon a classical multiply-periodic system. Hence it joined, in a mysterious magic way, two contradictory pictures of radiation: the Bohr-Sommerfeld quantum jumps and the classical continuous Hertz-Maxwell radiation.

The very idea of the correspondence principle was based on the philosophical belief in the existence of some rigorous quantum theory of atomic phenomena which had remained undiscovered up to 1925. This belief motivated all efforts at guessing this unknown theory by an appropriate reformulation of its asymptotic form, i.e. classical mechanics and electrodynamics (Born, 1925).

(B) The Uncertainty Postulate

This postulate was the basic idea of the epoch-making paper of Heisenberg (1925). According to it, microscopic phenomena are essentially non-local in both configuration and phase space. Concepts such as a trajectory and a hodograph of an electron become essentially inadequate within the framework of atomic phenomena. They are incompatible with the Heisenberg matrix representation of physical quantities. Therefore, the resulting quantum theory breaks with classical ideas even on the elementary level of kinematics. Before Heisenberg's discovery it was rather expected that the classical dynamics (the equations of motion) was to be replaced by some quantum theory, but nobody supposed the classical notions of state, position, etc., to be essentially inadequate.

The uncertainty principle derived by Heisenberg in 1927 is a mathematical comment on his uncertainty postulate. It describes in quantitative terms the phase-space non-localness of quantum phenomena. The modern general formulation of this principle predicts the relationship between statistical dispersions of arbitrary quantities when measured in the same quantum state.
When $\Delta A$, $\Delta B$ are dispersions of quantities described by operators $A$, $B$ in the quantum state $\rho$, then

$$\Delta A\,\Delta B \ge \frac{1}{2}\left|\langle[A, B]\rangle_\rho\right| \tag{1}$$

In particular, for positions and momenta

$$\Delta q^i\,\Delta p_j \ge \frac{\hbar}{2}\,\delta^i_j \tag{2}$$

Therefore quantum phenomena are non-local in a classical phase space. The critical phase volume characteristic of this non-localness is of the order $(\hbar/2)^n$, where $n$ is the number of degrees of freedom.

Within the finished modern framework of quantum theory the uncertainty and the correspondence principles seem to be rather secondary results of the basic assumptions, following from them automatically in a purely logical way. In spite of such views we are going to show that there are still some doubtful questions and physical problems of interpretation connected with these principles. The correspondence of some classical and quantum concepts is a delicate matter. This concerns, for example, pure states. It appears that the purely logical formal approach based on the $\hbar \to 0$ asymptotics is not sufficient. Our analysis leads to interesting physical consequences concerning the relationship between concepts of information and symmetry on the classical and the quantum level (Sławianowski, 1973).

We start with the derivation of the Weyl-Wigner-Moyal phase-space formulation of quantum mechanics directly from the uncertainty postulate.

2. THE WEYL-MOYAL QUANTIZATION
(if Heisenberg Had Started with Statistical Mechanics)

The uncertainty principle does not preclude us from describing quantum phenomena in terms of classical phase space. It implies only that such a description must be essentially non-local.

The phase-space formulation of quantum mechanics is due to Weyl (1928, 1931), Wigner (1932) and Moyal (1949). In particular, Wigner functions describing quantum statistical ensembles have long been used in physical chemistry.
The phase-space methods as developed by Wigner seem to be a secondary, accidental consequence of the usual Hilbert space approach. Weyl and Moyal developed more systematic formulations in which the classical phase space and its geometry played an essential role prior to the Hilbert space techniques. Nevertheless the basic ideas, motives and techniques of Moyal and those of Weyl were completely different. Moyal aimed at replacing an abstract geometry of Hilbert spaces by more familiar statistical methods in a classical phase space. On the other hand, the group-theoretical Weyl approach is based completely on the mathematical a priori.

The formulation we present here joins and unifies the approaches of Weyl and Moyal. Starting with the uncertainty principle as a primary idea, we suggest reformulating classical statistical mechanics so as to turn it into a non-local theory. It appears that the Weyl-Wigner-Moyal formulation of quantum mechanics is the most natural result. To get it, we will appeal to some group-theoretical results of Bargmann (1954).

Let us note that starting with the uncertainty relations as a primary basis for the 'deductive' construction of quantum mechanics is historically justified and free from tautology. In fact the famous derivation of this principle based on the idea of the Heisenberg microscope appealed explicitly to the semi-classical model of interaction between the electromagnetic field and electrons. Hence, it was certainly possible to achieve this result before 1925, when matrix mechanics was formulated.

Unfortunately, the Weyl-Wigner-Moyal approach is applicable only to systems with affine symmetry of degrees of freedom. An affine phase space is a triplet $(P, \Pi, \Gamma)$ where:

(1) $(P, \Pi)$ is an affine space; $P$ is its underlying set, i.e. a manifold of classical states, and $\Pi$ is a linear space of translations (free vectors) on $P$.
(2) $\Gamma$ is a covariant, skew-symmetric and non-degenerate tensor of the second order on $\Pi$: $\Gamma \in \bigwedge^2\Pi^*$.
The only translation carrying $a \in P$ over onto $b \in P$ will be denoted as $\overrightarrow{ab} \in \Pi$. We will use only affine coordinates on $P$: when $(a; \ldots e_i \ldots)$ is an affine frame, i.e. $a \in P$ and $\{e_i\}$ is a linear basis in $\Pi$, then the corresponding affine coordinates of $b \in P$ are the components of $\overrightarrow{ab}$ with respect to the basis $\{e_i\}$:

$$\overrightarrow{ab} = \eta^i(b)\,e_i$$

The reciprocal contravariant tensor of $\Gamma$ will be denoted as $\tilde\Gamma$. Raising and lowering of indices is to be understood in the sense of the skew-symmetric 'metric' $\Gamma$. Instead of $\tilde\Gamma^{ab}$ we will write shortly $\Gamma^{ab}$; this does not lead to misunderstandings:

$$\Gamma^{ac}\,\Gamma_{cb} = \delta^a{}_b$$

$\Gamma$ gives rise to the skew-symmetric scalar product on $\Pi$ and to the Poisson bracket operation.

The Poisson bracket of smooth functions $F$, $G$ on $P$ is given by

$$\{F, G\} = \langle dF \otimes dG, \tilde\Gamma\rangle = \Gamma^{ab}\,\frac{\partial F}{\partial\eta^a}\,\frac{\partial G}{\partial\eta^b} \tag{3}$$

where $\eta^a$ are affine coordinates on $P$.

The algebraic non-singularity of $\Gamma$ implies that $\dim P = 2n$, where $n$ is a natural number, the so-called number of degrees of freedom. One can choose the affine frame in $P$ in such a way that

$$\|\Gamma^{ab}\| = \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix} \tag{4}$$

where $I$ is the $n \times n$ identity matrix (Kronecker matrix), $0$ is a matrix with all elements vanishing, and $\Gamma^{ab}$ are the components with respect to the chosen frame. Denoting the corresponding affine coordinates as $(\ldots q^i \ldots, \ldots p_i \ldots)$ we find

$$\{F, G\} = \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q^i} - \frac{\partial F}{\partial q^i}\frac{\partial G}{\partial p_i} \tag{5}$$

Such coordinates are called canonical ones.

The affine structure gives rise to the translationally-invariant Lebesgue measure on $P$. It is unique up to normalization. One can normalize it in such a way that the integration consists in

$$f \mapsto \int f\,d\mu = \int f\,dq^1 \ldots dq^n\,dp_1 \ldots dp_n \tag{6}$$

for any system of canonical coordinates. The above definition is correct because it does not depend on the particular choice of canonical coordinates.

Such a normalization is inconvenient from the point of view of statistical mechanics, because the measure has the physical dimension of the $n$th power of action.
So there is no unique definition of the entropy of statistical ensembles until some unit of action $\lambda$ is chosen. Such a choice enables us to get rid of the physical dimension in the volume measure on $P$. Namely, we fix this measure $\mu_\lambda$ as follows

$$\int f(p)\,d\mu_\lambda(p) = \lambda^{-n}\int f\,dq^1 \ldots dq^n\,dp_1 \ldots dp_n \tag{7}$$

where $(q^i, p_i)$ are arbitrary canonical coordinates on $P$. The entropy $S_\lambda(\rho)$ of a probabilistic measure on $P$ that is absolutely continuous with respect to $\mu_\lambda$ and has density $\rho$ is given as

$$S_\lambda(\rho) = -\int \rho\,\ln\rho\,d\mu_\lambda \tag{8}$$

Roughly speaking, $\mu_\lambda(U)$ is the number of $\lambda^n$ cells contained in the domain $U \subset P$.

According to the Planck theory of black-body radiation we have at our disposal the quantum of action, which is at the same time a universal constant of nature:

$$h = 6.54 \times 10^{-27}\ \text{erg sec} = 2\pi\hbar$$

Hence, we could put $\lambda = h$. This is just the first place where physical quantum notions taken from experimental analysis are introduced into the geometrical framework of the classical phase space. To retain the correspondence with some commonly used formulas we will rather put $\lambda = h/2$. Obviously, this does not alter any essential matter; such a choice only simplifies our notation. In the following we will often write simply $\mu$, $S$ instead of $\mu_{h/2}$, $S_{h/2}$. Hence our measure is given as follows:

$$\int f\,d\mu = (2/h)^n \int f\,dq^1 \ldots dq^n\,dp_1 \ldots dp_n \tag{9}$$

Besides the above entropy argument there is one more reason justifying (7) with $\lambda = h/2$. In fact, according to the uncertainty relations any essentially $2n$-dimensional physical situation in $P$ must be smeared out onto a region in $P$ the volume of which exceeds $(h/2)^n$. States of quantum systems are related to phase-space regions the $\mu_{h/2}$ volumes of which are of order one at least. Therefore, roughly speaking, $\mu_{h/2}(U)$ gives an account of the number of quantum states within the phase-space region $U \subset P$.
Similarly, one can reasonably expect that quantum statistics attaches well-defined probabilities only to those subsets which are large when compared with critical $h^n$ cells (which 'contain' many quantum states).

The uncertainty principle suggests that we reformulate classical statistical mechanics so as to turn it into a non-local theory. Let us start with the basic notions of classical statistics in $P$:

Physical quantities, i.e. random variables, are described by real analytic functions on $P$.

Statistical ensembles are described by probabilistic measures on $P$. We are especially interested in measures absolutely continuous with respect to $\mu_{h/2}$; they are described by positive statistical densities normalized in such a way that

$$\int \rho\,d\mu_{h/2} = 1 \tag{9a}$$

Obviously

$$\rho^* = \rho, \qquad \int \rho\,A^*A\,d\mu \ge 0 \quad \text{for arbitrary } A \tag{10}$$

In the following, even when talking about measures concentrated on subsets of $\mu$-measure zero, we will describe them shortly by 'densities' $\rho$, keeping in mind that they are then distributions rather than usual functions.

The operational statistical interpretation of the above notions is based on the following concepts:

(1) Expectation value of a physical quantity $A$ in the ensemble $\rho$:

$$\langle A\rangle_\rho = \int A(p)\,\rho(p)\,d\mu_{h/2}(p) \tag{11}$$

(2) The non-normalized probability of detecting a system in a statistical state $\rho_2$ when it is known to be in a state $\rho_1$:

$$P(\rho_1, \rho_2) = \int \rho_1(p)\,\rho_2(p)\,d\mu_{h/2}(p) \tag{12}$$

(3) 'Proper ensemble' of a physical quantity $A$ with an eigenvalue $a \in A(P)$. Measurements of $A$ on this ensemble give the result $a$ with certainty; there is no statistical spread. Therefore, $A$ becomes constant and equal to $a$ when restricted to the support of $\rho$:

$$A\,|_{\mathrm{Supp}\,\rho} = a \tag{13}$$

This is equivalent to the eigenequation:

$$A\rho = a\rho \tag{14}$$

When $A$ is a non-constant analytic function, then:

$$\rho = F\,\delta(A - a) \tag{15}$$

where $F$ is non-negative on $\mathrm{Supp}\,\rho$.
Hence, 'proper ensembles' of $A$ with an eigenvalue $a$ are measures concentrated on the $a$-value surface of $A$:

$$M_{(A,a)} = \{p \in P : A(p) = a\} \tag{16}$$

When $F = 1$, (15) becomes an $(A, a)$-microcanonical ensemble, i.e. the statistical distribution $\delta(A - a)$.

The only essentially local notion used in (1), (2) and (3) is the usual, pointwise product of functions (distributions). This is especially apparent in (3), i.e. in the classical spectral analysis of statistical ensembles and physical quantities. The lack of statistical dispersion of $A$ measurements performed on the ensemble $\rho$ is evidently a physical notion, prior to any particular mathematical model. It is only the structure of the associative function-algebra over $P$ under the pointwise product that is responsible for the essentially local relationship between proper ensembles of $A$ and value surfaces of $A$. Therefore, the most natural way to achieve a non-local description of measurements in $P$ compatible with the uncertainty principle is to replace the afore-mentioned associative algebra by some other function-algebra based on a non-localized product $A \perp B$:

$$(A \perp B)(p) = \int \mathcal{K}(p; p_1, p_2)\,A(p_1)\,B(p_2)\,d\mu(p_1)\,d\mu(p_2) \tag{17}$$

Such a product gives rise to a non-local statistical theory in $P$: one should only replace the pointwise product in (10), (11) and (12), and especially in (14), by the above product (17). Now we will find the appropriate form of $\mathcal{K}$, starting with some natural postulates.

We assume $\perp$ to be translationally invariant:

$$(A \perp B)_\pi = A_\pi \perp B_\pi \tag{18}$$

for arbitrary functions $A$, $B$ and $\pi \in \Pi$; $C_\pi$ denotes the function obtained by a $\pi$-translation of $C$:

$$C_\pi(q) = C(p) \tag{19}$$

where $q$ is the point obtained from $p$ by the translation $\pi$. Translational invariance implies

$$\mathcal{K}(p; p_1, p_2) = K(\overrightarrow{pp_1}, \overrightarrow{pp_2}) \tag{20}$$

for some function $K$.
We assume the non-local product (17) to be associative:

$$(A \perp B) \perp C = A \perp (B \perp C) \tag{21}$$

This results in the functional equation for $K$

$$\int K(x_1, x)\,K(x_2 - x, x_3 - x)\,dx = \int K(x_1 - x, x_2 - x)\,K(x, x_3)\,dx \tag{22}$$

(Sławianowski, 1974), where $dx$ is a translationally-invariant Lebesgue measure on $\Pi$. The convolution-like structure of (22) suggests that we search for its solution in the Fourier representation. Our functional equation then becomes purely algebraic:

$$\hat K(\xi_1, \xi_2 + \xi_3)\,\hat K(\xi_2, \xi_3) = \hat K(\xi_1, \xi_2)\,\hat K(\xi_1 + \xi_2, \xi_3) \tag{23}$$

(Sławianowski, 1974), where $\hat K: \Pi^* \times \Pi^* \to \mathbb{C}^1$ is the Fourier transform of $K$:

$$\hat K(\xi, \eta) = \int K(x, y)\,\exp\left[-i\left(\langle\xi, x\rangle + \langle\eta, y\rangle\right)\right]dx\,dy \tag{24}$$

Equation (23) is easily recognized to coincide with the functional equation for factors of projective representations of the abelian additive group $\Pi^*$ (Bargmann, 1954). Even without any appeal to the theory of projective representations, it is easy to show (in elementary terms) that the only smooth and bounded solutions of (23) are given by:

$$\hat K(\xi, \eta) = \exp\left[i\langle\xi \otimes \eta, B\rangle\right] = \exp\left[iB^{ab}\xi_a\eta_b\right] \tag{25}$$

where $B \in \Pi \otimes \Pi$ is an arbitrary real, contravariant tensor of the second order on $\Pi$ (Bargmann, 1954).

The correspondence principle suggests that we put $B$ proportional to $\tilde\Gamma$: $B = b\tilde\Gamma$, because of the fundamental role of $\Gamma$ in the geometry of $P$. If we had chosen any other form of $B$, we would have broken the symplectic symmetry of the problem; there is no sufficient, non-arbitrary reason for any other choice of $B$. One can easily show that when $B = b\tilde\Gamma$, then:

$$K(x, y) = \exp\left[\frac{\sigma i}{\hbar}\langle\Gamma, x \otimes y\rangle\right] = \exp\left[\frac{\sigma i}{\hbar}\Gamma_{ab}\,x^a y^b\right] \tag{26}$$

where $\sigma$ is some real number depending on $b$, and the use of $\hbar$ enables us to get rid of the physical dimension in the exponent, which is necessary if (26) is to be well-defined. It appears that the particular choice of $\sigma$ does not matter: associative algebras corresponding to various values of $\sigma$ are isomorphic with each other. To attain correspondence with the currently used notation, we put $\sigma = 2$.
(Obviously, there is no physics nor mathematics in any such choice.) Finally:

The non-local Weyl-Moyal product is defined as:

$$(A \perp B)(p) = \int \exp\left[\frac{2i}{\hbar}\langle\Gamma, \overrightarrow{pp_1} \otimes \overrightarrow{pp_2}\rangle\right]A(p_1)\,B(p_2)\,d\mu_{h/2}(p_1)\,d\mu_{h/2}(p_2) \tag{27}$$

Let us quote the following properties of $\perp$:

$$\int A \perp B\,d\mu = \int AB\,d\mu = \langle A^*|B\rangle \tag{28}$$

$$(A \perp B)^* = B^* \perp A^* \tag{29}$$

$$1 \perp A = A \perp 1 = A \tag{30}$$

moreover, the constant function 1 is the only function satisfying (30) for all $A$.

$$A^* \perp A \ne 0 \tag{31}$$

unless $A$ is a function vanishing almost everywhere.

$$\langle C \perp A|B\rangle = \langle A|C^* \perp B\rangle = \langle C|B \perp A^*\rangle \tag{32}$$

$$\text{If } C \perp A = 0 = A \perp D \text{ for all } A, \text{ then } C = D = 0 \tag{33}$$

Contrary to the pointwise product, the Weyl multiplication is non-commutative; its centre consists of constant functions only.

Besides, let us notice the following asymptotic formulas which give an account of the correspondence principle:

$$\lim_{\hbar \to 0} A \perp B = AB \tag{34}$$

$$\lim_{\hbar \to 0} \frac{1}{\hbar i}\,(A \perp B - B \perp A) = \{A, B\} \tag{35}$$

provided $A$, $B$ are smooth and $A \perp B$ is well-defined.

Replacing in (10), (11), (12) and (14) the pointwise product $AB$ by the Weyl product $A \perp B$ we get the non-local statistical theory in $P$. The property (28) implies that (11) and (12) do not change at all. On the contrary, (14) is replaced in a non-trivial way by the following eigenequation:

$$A \perp \rho = a\rho \tag{36}$$

where both $a$ and $\rho$ are unknown. We are looking only for probabilistic solutions of (36), i.e. such functions (distributions) $\rho$ which satisfy the normalization condition and are positively semi-definite in the non-local sense:

$$\int \rho\,d\mu = 1 \tag{37}$$

$$\langle\rho|A^* \perp A\rangle = \int \rho\,(A^* \perp A)\,d\mu = \int \rho \perp A^* \perp A\,d\mu \ge 0 \tag{38}$$

for all functions $A$.

The subset of $\mathbb{R}$ composed of those values of $a$ for which probabilistic solutions of (36) exist is in no direct way related to $A(P)$. The relationship between solutions of (36) and value surfaces of $A$ is essentially non-local. Let us notice that, contrary to the classical semi-definiteness condition $\langle\rho|A^*A\rangle \ge 0$, (38) does not imply that $\rho \ge 0$. Quantum density functions are allowed to take negative values.
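A standard textbook case illustrates how a normalized quantum density can be negative at a point. The sketch below is not taken from the text: it assumes a one-dimensional harmonic oscillator in units $\hbar = m = \omega = 1$, and uses the well-known Wigner functions of its ground and first excited states, rescaled so that they are densities with respect to $dq\,dp/(\pi\hbar)$, the measure obtained from (9) for $n = 1$. The function names are illustrative.

```python
import math

def rho_oscillator(level, q, p):
    """Phase-space density of oscillator level 0 or 1 (units hbar = m = omega = 1).

    Assumption: standard Wigner functions, multiplied by pi*hbar so that the
    normalization is taken with respect to dq dp / (pi*hbar).
    """
    r2 = q * q + p * p
    if level == 0:
        return math.exp(-r2)
    if level == 1:
        return (2.0 * r2 - 1.0) * math.exp(-r2)
    raise ValueError("only levels 0 and 1 are implemented in this sketch")

def total_probability(level, half_width=6.0, steps=240):
    """Midpoint-rule estimate of the integral of rho over dq dp / pi."""
    cell = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        q = -half_width + (i + 0.5) * cell
        for j in range(steps):
            p = -half_width + (j + 0.5) * cell
            total += rho_oscillator(level, q, p)
    return total * cell * cell / math.pi

def disc_probability(mu):
    """Integral of the level-1 density over the disc q^2 + p^2 < mu.

    In these units mu is also the measure of the disc, area / pi.
    Closed form: integral of (2u - 1) exp(-u) du from 0 to mu.
    """
    return 1.0 - (2.0 * mu + 1.0) * math.exp(-mu)

if __name__ == "__main__":
    print(rho_oscillator(1, 0.0, 0.0))        # value at the origin
    print(total_probability(1))               # normalization check
    for mu in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(mu, disc_probability(mu))
```

The first excited state is negative in a neighbourhood of the origin although its total integral is one, and its integral over a disc becomes positive only once the measure of the disc is somewhat larger than one cell.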
They become positive in the usual, pointwise and local sense after coarse-graining over subsets of $P$ which are large when compared with the critical Heisenberg $h^n$ cells. The quantity

$$\int_U \rho\,d\mu \tag{39}$$

can be approximately interpreted as a non-negative probability of localization in $U \subset P$ only when $\mu_\hbar(U) \gg 1$. Quantum spectral analysis and quantum measurements, being based on (36), are essentially non-local in $P$. The very structure of (27) implies that this non-localness is of the order $h^n$, which agrees with the uncertainty principle.

The non-local Weyl-Moyal statistical mechanics is obviously isomorphic with the usual Hilbert space formulation of quantum mechanics. The corresponding isomorphism, the so-called Weyl prescription, attaches to functions on $P$ operators in an appropriate Hilbert space. Denoting the operator corresponding to $A$ as $\hat A$, we have:

$$\int A\,d\mu = \mathrm{Tr}\,\hat A, \quad \widehat{aA + bB} = a\hat A + b\hat B, \quad \widehat{A \perp B} = \hat A\hat B, \quad \widehat{A^*} = \hat A^+,$$

$$\langle A|B\rangle = \int A^* \perp B\,d\mu = \int A^* B\,d\mu = \mathrm{Tr}(\hat A^+ \hat B)$$

The space $L^2(R^n)$ can be used as a Hilbert space of wave functions. In fact, let $(q^i, p_i)$ be canonical affine coordinates on $P$ and $A(q^i, p_i)$ some function on $P$ (obviously $A$ itself is a function on $R^{2n}$, hence $A(q^i, p_i)$ is to be understood as a function on $P$ resulting from superposing two mappings: $(q^i, p_i): P \to R^{2n}$ and $A: R^{2n} \to C^1$. Do not take $A(q^i, p_i)$ to be a value of $A$ at a point of $R^{2n}$!). Then, denoting the corresponding operator by $\hat A$ and the kernel of its integral representation as $\langle x^i|\hat A|y^i\rangle$:

$$(\hat A\Psi)(x^i) = \int \langle x^i|\hat A|y^i\rangle\,\Psi(y^i)\,d_n y \tag{40}$$

we find:

$$\langle x^i|\hat A|y^i\rangle = \left(\frac{1}{2\pi\hbar}\right)^n \int A\left(\frac{x^i + y^i}{2}, p_i\right) \exp\left[\frac{i}{\hbar}p_i(x^i - y^i)\right] d_n p \tag{41}$$

(41) is just the famous Weyl prescription. A pure state corresponding to the wave function $\Psi(x^i)$ is described in the Weyl-Moyal language by the following Wigner function (Moyal, 1949; Wigner, 1932):

$$\rho = \left(\frac{1}{2\pi\hbar}\right)^n \int \Psi^*\left(q^i - \frac{\tau^i}{2}\right) \exp\left(-\frac{i}{\hbar}\tau^i p_i\right) \Psi\left(q^i + \frac{\tau^i}{2}\right) d_n\tau \tag{42}$$

Obviously (42) implies then:

$$\rho \perp \rho = \rho \tag{43}$$

In contrast to the system of classical 'proper equations':

$$A_i\rho = a_i\rho, \quad i = 1 \ldots m \tag{44}$$

the corresponding quantum systems of eigenequations:

$$A_i \perp \rho = a_i\rho, \quad i = 1 \ldots m \tag{45}$$

need not be compatible. The compatibility condition has the form:

$$[A_i, A_j] = \frac{1}{\hbar i}(A_i \perp A_j - A_j \perp A_i) = C^k_{ij} \perp (A_k - a_k) \tag{46}$$

where $C^k_{ij}$ are some functions (Dirac, 1964). One can show that $\rho$ describes a pure state, i.e. (42) or equivalently (43) is satisfied, if and only if it satisfies a maximal compatible system (45), that is, one admitting no non-trivial extension.

In more rigorous terms: let $B$ denote the space of those functions $A$ on $P$ for which the corresponding operators $\hat A$ (via the Weyl prescription) are bounded. Let $\rho \in B$ describe a quantum statistical ensemble:

$$\int \rho\,d\mu = \mathrm{Tr}\,\hat\rho = 1, \quad \langle\rho|A^* \perp A\rangle \geq 0$$

Now, we introduce the following space of functions:

$$E_\rho = \{A \in B : A \perp \rho = 0\} \tag{47}$$

Then, $\rho$ is a pure state if and only if $E_\rho$ is a maximal left ideal in the associative (although non-commutative) algebra $(B, \perp)$. When the functions $(A_i - a_i)$ occurring in (45) generate such an ideal, then any consistent system of eigenequations:

$$A_i \perp \chi = a_i\chi, \quad F \perp \chi = 0 \tag{48}$$

is equivalent to (45); hence, there exist functions $F_j$ such that:

$$F = \sum_j F_j \perp (A_j - a_j) \tag{49}$$

For an arbitrary, not necessarily pure statistical ensemble $\rho$, the real subspace of $E_\rho$ composed of real-valued functions is a Lie algebra under the quantum Poisson bracket (i.e. it is closed under this operation).

3. CORRESPONDENCE PRINCIPLE AND WKB-ASYMPTOTIC EXPANSION (Information and Symmetry of Statistical Ensembles)

There are two kinds of asymptotics describing the correspondence principle: 'large quantum numbers' and 'small values of the Planck constant'. They are supposed to be essentially equivalent; however, up to now there is no rigorous and general proof of this equivalence. Besides, in spite of some current views, neither of these methods leads automatically to classical laws when starting with quantum theory.
Some kind of physical intuition and 'feeling' is necessary to avoid mistakes; there are dangers and traps typical of each of these descriptions.

Roughly speaking, the asymptotics of large quantum numbers consist of the limit transition:

$$\Delta n \to \infty, \quad n/\Delta n \to \infty \tag{50}$$

where $n$ is the mean quantum number of a physical situation and $\Delta n$ is the spread of quantum numbers (the width of a quantum state in the $n$-representation). In technical terms:

$$n \gg \Delta n \gg 1 \tag{50a}$$

Quantum formulae should then approach those derived from the classical laws. The typical danger of such a method is the neglect of the condition $\Delta n \gg 1$. In particular, let $\{\Psi_n\}$ be the wave functions of stationary states of a bound system (e.g. an atom). In general, it is not true for large values of $n$ that $\Psi_n$ become quasiclassical throughout the whole configuration space. (By quasiclassical we mean here the following: interpretable in terms of geometric objects of the classical Hamilton-Jacobi theory.) However, when superposing such quantum states with coefficients slowly varying inside the interval $(n - \Delta n/2, n + \Delta n/2)$ and vanishing outside of it, the non-classical terms of the various $\Psi_n$ approximately cancel each other, provided (50a) is satisfied. In such expressions the functions $\Psi_n$ can be replaced by their classical counterparts $\Psi_{\mathrm{cl},k}$ built from the Hamilton-Jacobi objects and described by continuous 'quantum numbers' $k$. Obviously summation over $n$ is then to be replaced by integration over $k$.

Basing the asymptotics on large quantum numbers is a convincing physical procedure. Its main idea is the cancelling of interference in situations described by rapidly oscillating wave functions (just those with large quantum numbers or short de Broglie waves).

The asymptotics of 'small values' of the Planck constant is of a rather formal nature. The Planck constant $\hbar$ is then treated as a free parameter of the theory.
All fundamental mathematical expressions of quantum mechanics are analytic functions of $\hbar$ on the positive real semi-axis: $0 < \hbar < \infty$. Quasiclassical analysis is based on the expansion of these expressions in asymptotic series about the dangerous point $\hbar = 0$. Passing over from finite to vanishing $\hbar$ is connected with some qualitative discontinuities. As a rule, the afore-mentioned asymptotic series are divergent. According to the correspondence principle, their lowest order terms are expected to coincide with the appropriate classical expressions. The structure of the asymptotics is essentially the same as that in the 'large quantum numbers' approach: when $\hbar \to 0$, quantum interference phenomena break down because we are then dealing with rapidly oscillating functions. In fact the basic quantum formulas involve expressions such as $\exp(iW/\hbar)$ where $W$ does not depend on $\hbar$. Obviously they become rapidly oscillating when $\hbar \to 0$. This is just the idea of the method of stationary phase (Born and Wolf, 1964; Erdelyi, 1956).

Technically, the $\hbar \to 0$ approach is much easier and more 'automatic' than that based on $n \to \infty$. However, it is rather formal and one must not forget that the asymptotics of small $\hbar$ are only a convenient conceptual shorthand for the physical asymptotics based on large quantum numbers. In fact, the transition $\hbar \to 0$ formally transforms the whole conceptual structure of the quantum theory into a classical one, but it has nothing to do with the real laboratory conditions of 'quasiclassicality'. Physics is interested rather in answering the question 'what are quasiclassical situations in the real world, when the Planck constant has a fixed value?' Of course, it is only the $n \to \infty$ asymptotics that are able to answer this question. They show that quantum laws, both kinematical and dynamical, when applied to states compatible with (50), asymptotically degenerate to classical laws.
Conditions (50) even justify the possibility of an approximate description of a quantum system in terms of the classical notion of state. Let us notice that in situations described by (50), $\hbar$ is small when compared with typical values of physical quantities, which are then of the order $\hbar n$, $\hbar\,\Delta n$. This is why the asymptotics $\hbar \to 0$ is justified, but only as a shorthand for 'large quantum numbers'.

Typical dangers of the $\hbar \to 0$ asymptotics are as follows. Let $\{\Psi_n\}$ again be the stationary states of a bound system. As calculated by means of the $\hbar$-dependent Schrödinger equation, $\Psi_n$ depend in addition on $\hbar$. Let us indicate this explicitly by writing $\Psi_{(n,\hbar)}$. It would be meaningless to calculate $\lim_{\hbar \to 0} \Psi_{(n,\hbar)}$ and expect any relationship with classical expressions; moreover, as a rule, such a limit does not exist at all. The reason is that it is only the fixed quantum number $n$, or equivalently the number of nodes of $\Psi_n$, that describes the quantum structure of a state and decides to what extent $\Psi_{(n,\hbar)}$ is essentially quantum or quasiclassical. Varying $\hbar$ in $\Psi_{(n,\hbar)}$ when $n$ is fixed would be non-physical, because the scale, the order of 'classicality' of $\Psi_n$, depends only on $n$. To take an extreme example: we could consider a ground state corresponding to the smallest possible value of $n$. It is obvious that there is neither a classical limit nor anything reasonable in putting $\hbar \to 0$ in the wave function of a ground state, because $n$ is then fixed and small, and $\hbar \to 0$ is only the theoretical shorthand for $n \to \infty$. Similarly, it is meaningless to apply the $\hbar \to 0$ asymptotics when studying spinning particles, because then some quantum numbers (describing internal angular momentum) are small.

In the following, we use the methods of the $\hbar \to 0$ asymptotics, carefully comparing some results with the analysis of the order of quantum numbers. Of course, we remain within the Weyl-Wigner-Moyal framework.
This formulation of quantum mechanics, being based on the geometry of classical phase spaces, is especially convenient when studying quasiclassical phenomena and the correspondence principle.

As we have mentioned in the previous section, the non-local Weyl-Moyal operations reduce in the limit $\hbar \to 0$ to the local, pointwise operations. More strictly: let $A$, $B$ be smooth functions over $P$ for which the Weyl product $A \perp B$ is well-defined (the integral (27) converges). Expanding (27) into an asymptotic series about $\hbar = 0$ by means of the method of stationary phase (Born and Wolf, 1964; Erdelyi, 1956), we get:

$$A \perp B = AB + \frac{i\hbar}{2}\{A, B\} + \ldots \tag{51}$$

Therefore:

$$\lim_{\hbar \to 0} A \perp B = AB \tag{52}$$

$$\lim_{\hbar \to 0} \frac{1}{\hbar i}(A \perp B - B \perp A) = \{A, B\} \tag{52a}$$

Hence the pointwise product and the classical Poisson bracket are classical limits of the Weyl product and the quantum Poisson bracket (Moyal bracket), respectively.

On the quantum level, the structures of Lie algebra and associative algebra are directly, algebraically related to each other. In fact the quantum Poisson bracket $[A, B] = \frac{1}{\hbar i}(A \perp B - B \perp A)$ is algebraically built from the associative product (via a commutator). This is not the case on the classical level: $\{A, B\}$ involves differentials of $A$, $B$ and fails to be an algebraic function of $A$, $B$. This is the main qualitative discontinuity of the classical limit $\hbar \to 0$. There are serious physical consequences of this fact. Namely, let us consider a left eigenequation:

$$A \perp \rho = a\rho \tag{53}$$

where $A$ is an arbitrary physical quantity ($A^* = A$, $\hat A$ is Hermitian) and $\rho$ is a statistical ensemble: $\int \rho\,d\mu = 1$, $\int \rho\,(B^* \perp B)\,d\mu \geq 0$ for arbitrary $B$, $\rho^* = \rho$.
Obviously (53) is the Weyl-Moyal counterpart of the left operator eigenequation for density operators:

$$\hat A\hat\rho = a\hat\rho \tag{53a}$$

Taking the complex conjugate of (53) and subtracting it from (53), we get via (29):

$$[A, \rho] = \frac{1}{\hbar i}(A \perp \rho - \rho \perp A) = 0 \tag{54}$$

or, in operator terms:

$$\hat A\hat\rho - \hat\rho\hat A = 0 \tag{54a}$$

But (54) means that the statistical ensemble $\rho$ is invariant under the one-parameter unitary group generated infinitesimally by $A$.

Example: let $A = L_i n^i$ be the component of the angular momentum along the axis given by the unit vector $n$. Now, let $\rho$ be a statistical ensemble with the sharply defined value $m$ of $L_i n^i$:

$$L_i n^i \perp \rho = m\rho$$

Then, $\rho$ is invariant under the one-parameter group of rotations about the $n$ axis. The records of the detector in a scattering experiment remain unaffected when the source producing particles in such an eigenstate $\rho$ is subject to rotations about the $n$ axis.

Therefore, the informative property (53) (no statistical spread when measuring $A$ on $\rho$) implies the invariance property (54). This is not the case on the classical level, i.e. for vanishing $\hbar$, because the equation

$$A\rho = a\rho \tag{55}$$

possesses solutions for which the equation

$$\{A, \rho\} = 0 \tag{56}$$

is not satisfied.

Roughly speaking: on the quantum level (finite $\hbar$) information implies symmetry. On the classical level (vanishing $\hbar$) this relationship breaks down.

Therefore, the purely formal methods of the $\hbar \to 0$ asymptotics are unsatisfactory and insufficient when studying classical counterparts of eigenstates. In fact, on the quantum level, the eigenstates of $A$ are uniquely defined by (53) or (53a). Therefore, equation (52) suggests that we define classical eigenvalues and eigenensembles of $A$ as those satisfying (55). However, from the physical point of view, it is hard to accept such an approach, because the eigenstates would then lose their fundamental physical symmetries.
The physical, qualitative understanding of the correspondence principle rather suggests that we should define classical eigenstates of $A$ as those satisfying both (55) and (56), i.e. the pair

$$A\rho = a\rho, \quad \{A, \rho\} = 0 \tag{57}$$

where $a$ is a regular value of $A$.

In fact, from the physical laboratory point of view, the notions of information and symmetry are essentially independent of each other in spite of their accidental relationship within the framework of Hamiltonian quantum mechanics. They involve two different kinds of physical operations:

(1). Measurements, mainly scattering experiments and statistical analysis of spreads of results, recorded by films or counters for example.
(2). Transformations, motions performed on the experimental set-up, e.g. on sources and detectors. The main examples are rigid translations and rotations.

Therefore, to retain the physical structure of eigenstates, we have to define them on the classical level by means of (57). This is confirmed by the quasiclassical scattering experiments.

When $A$ is an analytic function, then the only probabilistic solutions of (57) are distributions:

$$\rho = F\,\delta(A - a) \tag{58}$$

where:

$$\{F, A\} = (A - a)G \tag{59}$$

for some function $G$. The equation (59) is equivalent to:

$$\{F, A\}|M_{(A,a)} = 0 \tag{59a}$$

where:

$$M_{(A,a)} = \{p \in P : A(p) = a\} \tag{59b}$$

Remark: $F$ in (58) need not be differentiable; we assume only that $\{F, A\}$ exists, i.e. there exists the derivative of $F$ in the direction of the vector field $\overline{dA}$ which is related to the Pfaff form $dA$ via the $\Gamma$-raising of indices:

$$(\overline{dA})^a = \Gamma^{ab}\,dA_b \tag{59c}$$

In general, (57) or equivalently (59a) possesses many solutions. To restrict this arbitrariness it is necessary to perform additional measurements or, in mathematical terms, to add some similar conditions. Hence, let us consider the following system of classical eigenconditions:

$$A_i\rho = a_i\rho, \quad \{A_i, \rho\} = 0, \quad i = 1 \ldots m \tag{60}$$

where $a = (a_1 \ldots a_m) \in R^m$ is a regular value of the mapping $A = (A_1, \ldots, A_m): P \to R^m$ (this mapping transforms $p \in P$ onto $(A_1(p) \ldots A_m(p)) \in R^m$). Contrary to the system of proper equations:

$$A_i\rho = a_i\rho, \quad i = 1 \ldots m \tag{61}$$

which is always compatible, the eigenconditions (60) need not be compatible and as a rule they are not. To be more precise we give the following definitions.

A classical system of eigenconditions (60) is said to be compatible when:

(1). It is completely integrable in the sense of the theory of systems of differential equations;
(2). It is regular in the sense that $a = (a_1 \ldots a_m)$ is a regular value of the analytic mapping $A = (A_1 \ldots A_m): P \to R^m$.

One can easily show that the regular system (60) is compatible if and only if there exist functions $C^k_{ij}$ such that:

$$\{A_i, A_j\} = \{A_i - a_i, A_j - a_j\} = C^k_{ij}(A_k - a_k) \tag{62}$$

When $A_1 \ldots A_m$ are functionally independent (which ensures the regularity of (60)), then the only probabilistic solutions of (60) have the form:

$$\rho = F\,\delta(A_1 - a_1) \ldots \delta(A_m - a_m) \tag{63}$$

where:

$$\{A_i, F\} = F\,C^j_{ij} + G_i^{\,j}(A_j - a_j) \tag{64}$$

for some functions $G_i^{\,j}$.

Classical compatibility conditions (62) are obvious counterparts of the corresponding quantum equations (46). They possess a geometric interpretation which is interesting from the point of view of both physics and mathematics. In fact let

$$M_{(A,a)} = \{p \in P : A_i(p) = a_i,\ i = 1 \ldots m\} \tag{64a}$$

then:

$$\{A_i, A_j\}|M_{(A,a)} = \{A_i - a_i, A_j - a_j\}|M_{(A,a)} = 0 \tag{62a}$$

Equations (62) and (62a) imply that $M_{(A,a)}$ is what Dirac, Bergmann, Goldberg, and others have called first-class constraints (Bergmann, 1966; Bergmann, 1970; Bergmann and Goldberg, 1955; Dirac, 1950; Dirac, 1951; Dirac, 1955 a,b; Dirac, 1964; Sławianowski, 1971; Sławianowski, 1975; Tulczyjew, 1968). Roughly speaking, the submanifold $M \subset P$ is said to be of first class if any vector $\Gamma$-orthogonal to $M$ must be at the same time tangent to $M$.
More strictly, let $p \in M$ and let $T_pM$ denote the linear subspace tangent to $M$ at $p$. Now let $k \in \Pi$ be such a vector that for arbitrary $u \in T_pM$:

$$\langle\Gamma, k \otimes u\rangle = \Gamma_{ab}k^a u^b = 0 \tag{65}$$

$M$ is a first-class submanifold if, for arbitrary $p \in M$, (65) implies $k \in T_pM$.

Let $V(M)$ denote the set of analytic functions vanishing on $M$, for an arbitrary subset $M \subset P$:

$$V(M) = \{f \in C^\omega(P) : f|M = 0\} \tag{66}$$

Obviously, $V(M)$ is an ideal in the associative algebra $C^\omega(P)$, i.e. it is closed under pointwise multiplication by arbitrary analytic functions. What is interesting is that $V(M)$ is at the same time a Lie algebra (i.e. it is closed under the Poisson bracket) if and only if $M$ is a first-class submanifold. In particular let $\rho$ be an arbitrary classical probability distribution and

$$\mathcal{E}_\rho = \{f \in C^\omega(P) : f\rho = 0\} \tag{67}$$

[compare with (47)]. Obviously, $\mathcal{E}_\rho = V(\mathrm{Supp}\,\rho)$. For arbitrary $\rho$, $\mathcal{E}_\rho$ is an associative ideal. It is a Lie subalgebra if and only if the support of $\rho$, $\mathrm{Supp}\,\rho$, is a first-class submanifold, i.e. if $\rho$ satisfies some compatible system of classical eigenconditions. Let us recall the corresponding property of the quantum space $E_\rho$. The analogy is obvious.

In the following, the ideals $V(M)$ corresponding to first-class manifolds $M$ will be called self-consistent. Any such ideal has the form $\mathcal{E}_\rho = V(\mathrm{Supp}\,\rho)$ where $\rho$ satisfies some compatible system (60). Probability distributions $\rho$ such that $\mathcal{E}_\rho$ is not self-consistent ($\mathrm{Supp}\,\rho$ fails to be a first-class submanifold) must not be looked on as classical counterparts of quantum states. They are non-interpretable from the point of view of the correspondence principle, because they violate the uncertainty principle on the quasiclassical level. A typical example of such a 'wrong' distribution is:

$$\rho = F\,\delta(q^1 - a^1)\,\delta(p_1 - b_1) \tag{68}$$

Both $q^1$ and its conjugate momentum $p_1$ are simultaneously dispersion-free on such a $\rho$; hence $\mathrm{Supp}\,\rho$ fails to be of first class, because $\{q^1, p_1\} = 1 \neq 0$, and $\rho$ is non-interpretable in terms of the quasiclassical theory.
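The first-class test applied to the support of (68) can be illustrated numerically. The following is only a minimal sketch, not part of the original argument: it estimates the canonical Poisson bracket by central differences and compares the 'wrong' pair of constraints $q^1 - a^1$, $p_1 - b_1$ with the admissible pair $q^1 - a^1$, $q^2 - a^2$. The helper name `poisson_bracket`, the sample values and the step `eps` are illustrative assumptions.

```python
def poisson_bracket(f, g, z, n, eps=1e-5):
    """Central-difference estimate of {f, g} at z = [q^1..q^n, p_1..p_n].

    Sign convention as in the text: {q^1, p_1} = 1.
    """
    def partial(h, k):
        zp, zm = list(z), list(z)
        zp[k] += eps
        zm[k] -= eps
        return (h(zp) - h(zm)) / (2.0 * eps)

    return sum(
        partial(f, i) * partial(g, n + i) - partial(f, n + i) * partial(g, i)
        for i in range(n)
    )


n = 2                                # two degrees of freedom (assumed for the demo)
a1, a2, b1 = 0.3, -1.2, 0.7          # hypothetical sharp values
z0 = [a1, a2, b1, 2.0]               # a point lying on all three constraint surfaces

q1_c = lambda z: z[0] - a1           # q^1 - a^1
q2_c = lambda z: z[1] - a2           # q^2 - a^2
p1_c = lambda z: z[2] - b1           # p_1 - b_1

bad_pair = poisson_bracket(q1_c, p1_c, z0, n)    # does not vanish: not first class
good_pair = poisson_bracket(q1_c, q2_c, z0, n)   # vanishes: first class
print(bad_pair, good_pair)
```

Since the bracket of the first pair does not vanish on the constraint surface, condition (62a) fails for it, in line with the statement that (68) has no quasiclassical interpretation.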
On the quantum level special attention is paid to the pure states, which carry the maximal possible information and could be defined as those satisfying a maximal system of compatible eigenequations (45), i.e. answering a maximal number of operational 'questions'. The corresponding left ideal $E_\rho$ [cf. expression (47)] then becomes maximal. The question arises as to the classical counterparts. The correspondence principle suggests that we look for classical distributions $\rho$ for which $\mathcal{E}_\rho$ is a maximal self-consistent ideal. Roughly speaking, such a distribution $\rho$ satisfies some maximal system of compatible eigenconditions (60): 'maximal' means here that any extended system of eigenconditions

$$A_i\chi = a_i\chi, \quad F\chi = 0, \quad \{A_i, \chi\} = 0, \quad \{F, \chi\} = 0 \tag{60a}$$

is consistent only if

$$F = \sum_j (A_j - a_j)F_j \tag{60b}$$

for some functions $F_j$. Hence (60a) is then equivalent to (60).

Obviously, $\mathcal{E}_\rho$ is a maximal self-consistent ideal of the type $V(M)$ if and only if $\mathrm{Supp}\,\rho$ is a minimal closed analytic submanifold of first class in $P$. The lowest possible dimension of such a manifold equals the number of degrees of freedom: $n = \frac{1}{2}\dim P$. The first-class manifold then becomes what mathematicians call a lagrangian manifold, i.e. a maximal $\Gamma$-self-orthogonal manifold.

In rigorous terms: $M \subset P$ is said to be an isotropic manifold if any vector tangent to it is at the same time $\Gamma$-orthogonal to $M$, that is, if for an arbitrary pair of vectors tangent to $M$ at $p$, $u, v \in T_pM \subset \Pi$, the following orthogonality equations hold:

$$\langle\Gamma, u \otimes v\rangle = \Gamma_{ab}u^a v^b = 0 \tag{69}$$

One can easily show that $\dim M \leq n$. An isotropic manifold is said to be lagrangian if there are no vectors transversal to it and $\Gamma$-orthogonal to it at the same time. Obviously an isotropic manifold is lagrangian if and only if its dimension equals $n$ (Weinstein, 1971; Weinstein, 1973; Sławianowski, 1971; Arnold, 1974).
When $\mathrm{Supp}\,\rho$ is a closed and connected lagrangian manifold, then obviously $\mathcal{E}_\rho = V(\mathrm{Supp}\,\rho)$ is a maximal self-consistent ideal in $C^\omega(P)$ and $\rho$ is a classical counterpart of a pure quantum state. Typical and suggestive examples are as follows:

$$\rho = \delta(q^1 - a^1) \ldots \delta(q^n - a^n) \tag{70a}$$

$$\rho = \delta(p_1 - b_1) \ldots \delta(p_n - b_n) \tag{70b}$$

(where $a$ and $b$ are constants), or, in a six-dimensional phase space $P$:

$$\rho = \delta(L_i n^i - m)\,\delta(L^2 - l^2)\,\delta\left(\tfrac{1}{2}p^2 + V(r) - E\right) \tag{71}$$

where

$$L_i = \varepsilon_{ij}{}^{k}\,q^j p_k, \quad L^2 = \sum_i L_i^2, \quad p^2 = \sum_i p_i^2, \quad r^2 = \sum_i (q^i)^2,$$

$n$ is a three-dimensional unit vector, and $m$, $l$, $E$ are constants. Obviously equations (70) describe classical statistical ensembles with sharply defined values of $q^i$, $p_i$, respectively (on the contrary, $p_i$ and $q^i$ then become completely undetermined, according to the classical uncertainty relations). Expression (71) is an ensemble with defined values of the component of the angular momentum along $n$, the square of the angular momentum and the energy. Hence, (71) corresponds to the classical partial wave analysis.

Distributions (70) and (71) are especially convenient when analysing classical scattering experiments. They are directly related to the Hamilton-Jacobi theory. Moreover, it appears that properties of classical probability distributions concentrated on lagrangian submanifolds are interpretable in terms of the classical Hamilton-Jacobi theory. There is nothing surprising in this: such distributions are classical counterparts of pure quantum states, which admit a description in terms of wave functions. But it is known even from elementary textbooks that the $\hbar \to 0$ asymptotics of the Schrödinger equation imposed on the wave function $\Psi = \sqrt{D}\exp(iS/\hbar)$ leads to the classical Hamilton-Jacobi equation imposed on the phase $S$ and to the continuity equation for probabilistic fluids (with the velocity field given by the gradient of $S$) (Landau and Lifshitz, 1958; Messiah, 1965).
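That the three quantities defining (71) are in involution, so that their common level set is a three-dimensional first-class (lagrangian) submanifold of the six-dimensional phase space, can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: the axis is taken as $n = (0, 0, 1)$, the potential as $V(r) = -1/r$, and the sample point and the finite-difference helper `poisson_bracket` are hypothetical choices.

```python
import math


def poisson_bracket(f, g, z, n, eps=1e-5):
    """Central-difference estimate of {f, g} at z = [q^1..q^n, p_1..p_n]."""
    def partial(h, k):
        zp, zm = list(z), list(z)
        zp[k] += eps
        zm[k] -= eps
        return (h(zp) - h(zm)) / (2.0 * eps)

    return sum(
        partial(f, i) * partial(g, n + i) - partial(f, n + i) * partial(g, i)
        for i in range(n)
    )


def ang_mom(z):
    """Components L_i = eps_ijk q^j p_k of the angular momentum."""
    q1, q2, q3, p1, p2, p3 = z
    return (q2 * p3 - q3 * p2, q3 * p1 - q1 * p3, q1 * p2 - q2 * p1)


Lx = lambda z: ang_mom(z)[0]
Lz = lambda z: ang_mom(z)[2]                    # L_i n^i for the assumed axis n = (0, 0, 1)
L2 = lambda z: sum(c * c for c in ang_mom(z))   # square of the angular momentum


def H(z):
    """Energy p^2/2 + V(r) with the assumed potential V(r) = -1/r."""
    r = math.sqrt(z[0] ** 2 + z[1] ** 2 + z[2] ** 2)
    return 0.5 * (z[3] ** 2 + z[4] ** 2 + z[5] ** 2) - 1.0 / r


z0 = [1.0, 0.5, -0.3, 0.2, -0.7, 0.4]           # arbitrary sample point

brackets = {
    "{Lz, L2}": poisson_bracket(Lz, L2, z0, 3),
    "{Lz, H}": poisson_bracket(Lz, H, z0, 3),
    "{L2, H}": poisson_bracket(L2, H, z0, 3),
    "{Lx, Lz}": poisson_bracket(Lx, Lz, z0, 3),  # two components of L: not in involution
}
for name, value in brackets.items():
    print(name, value)
```

The first three brackets vanish to within the discretization error, whereas two different components of the angular momentum have a non-vanishing bracket, so they could not be given sharp values simultaneously in an ensemble of the type (71).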
Let us investigate these problems in some detail. We start with interpreting some of our results in terms of wave functions. Let us consider a pure quantum state, the wave function of which is $\Psi(q^i)$ ($q^i$, $p_i$ are canonical affine coordinates on $P$). The corresponding Wigner function $\rho$ is given by (42):

$$\rho = \left(\frac{1}{2\pi\hbar}\right)^n \int \Psi^*\left(q^i - \frac{\tau^i}{2}\right) \exp\left(-\frac{i}{\hbar}\tau^i p_i\right) \Psi\left(q^i + \frac{\tau^i}{2}\right) d_n\tau \tag{42}$$

Let us put:

$$\Psi = \sqrt{D}\exp\left(\frac{i}{\hbar}S\right) \tag{72}$$

We will write $\rho[D, S]$ to indicate explicitly the functional dependence of $\rho$ on $D$, $S$. $D$ is assumed to be continuous and $S$ twice differentiable. We aim to find the classical limit of $\rho[D, S]$ expressed in terms of $D$ and $S$.

In the lowest order of the WKB-approximation, $D$ and $S$ are assumed to be independent of $\hbar$ (Landau and Lifshitz, 1958; Messiah, 1965). Moreover, the $\hbar$-independence of the physical interpretation of $D$ and $S$ is obvious even without any use of the WKB-approximation. In fact $D$ is a probability distribution for positions and $S$ is related to a spread of momentum:

$$\langle\Psi|\hat Q^i|\Psi\rangle = \int D(q^1 \ldots q^n)\,q^i\,dq^1 \ldots dq^n \tag{73}$$

$$\langle\Psi|\hat P_i|\Psi\rangle = \int D(q^1 \ldots q^n)\,\partial_i S(q^1 \ldots q^n)\,dq^1 \ldots dq^n \tag{74}$$

provided $\Psi$ is well-behaved at infinity (Mackey, 1963). These formulas do not involve the Planck constant explicitly.

Let $M_S$ be a maximal submanifold of $P$ on which all functions $p_i - \partial_i S(q^j)$ vanish:

$$[p_i - \partial_i S(q^j)]\,|M_S = 0 \tag{75}$$

It is obvious that $M_S$ is a lagrangian submanifold:

$$\{p_i - \partial_i S(q^k),\ p_j - \partial_j S(q^k)\} = (\partial_i\partial_j - \partial_j\partial_i)S(q^k) = 0$$

Now, let us introduce the following probability distribution $\rho_{cl}[D, S]$ on $P$, concentrated on $M_S$:

$$\rho_{cl}[D, S] = D(q^k)\,\delta(p_1 - \partial_1 S(q^k)) \ldots \delta(p_n - \partial_n S(q^k)) \tag{76}$$

Obviously, $\rho_{cl}$ is a classical probability distribution, non-negative in the usual local sense.
Nevertheless, as far as we are interested in analysing measurements of first-order polynomials of canonical affine coordinates, $\rho_{cl}[D, S]$ is essentially equivalent to $\rho$ on the rigorous quantum level (finite $\hbar$):

$$\int (\alpha_i q^i + \beta^i p_i)\,\rho_{cl}[D, S]\,dq^1 \ldots dp_n = \int (\alpha_i q^i + \beta^i p_i)\,\rho[D, S]\,dq^1 \ldots dp_n \tag{74a}$$

Of course, for higher order polynomials this formula breaks down (excepting polynomials depending on $q^i$ only). However, in the classical limit $\rho$ becomes exactly equivalent to $\rho_{cl}$. In fact, making use of the afore-mentioned independence of $D$ and $S$ of $\hbar$ in the lowest order of the WKB-expansion, we find

$$\lim_{\hbar \to 0} \rho[D, S] = \rho_{cl}[D, S] \tag{77}$$

Obviously, the limit is to be understood in the sense of generalized functions (Schwartz, 1950-1951).

Now, let $H$ be an arbitrary analytic function on $R^{2n}$ and $\hat H$ the operator corresponding to $H(q^1 \ldots q^n, p_1 \ldots p_n): P \to R$ in the sense of the Weyl prescription [cf. (40) and (41)]. We assume $\hat H$ to be bounded in $L^2(R^n)$. Let us consider the eigenequation

$$\hat H\Psi(q) = E\Psi(q) \tag{78}$$

where $\Psi \in L^2(R^n)$. We put $\Psi = \sqrt{D}\exp(iS/\hbar)$. Equation (78) implies that

$$H(q, p) \perp \rho[D, S] = E\rho[D, S] \tag{79}$$

$$H(q, p) \perp \rho[D, S] - \rho[D, S] \perp H(q, p) = 0 \tag{80}$$

In the classical limit, these equations become

$$H(q, p)\,\rho_{cl}[D, S] = E\rho_{cl}[D, S] \tag{81}$$

$$\{H(q, p), \rho_{cl}[D, S]\} = 0 \tag{82}$$

Expression (81) implies that

$$H(\ldots q^i \ldots, \ldots \partial_i S(q^k) \ldots) = E \tag{83}$$

which is none other than the time-independent Hamilton-Jacobi equation imposed on $S$. In geometric terms it means that

$$H(q, p)\,|M_S = E \tag{84}$$

[Figure 1: phase-space diagram with a horizontal $q$-axis and a vertical $p$-axis, showing the curve $p = \partial S/\partial q$ with statistical weight $D(q) = \Psi^*(q)\Psi(q)$.]

Figure 1 Wave functions and phase-space distributions. Systems of dots represent an exact quantum Wigner function $\rho[\Psi] = \rho[D, S]$. The surface $M_S$ given by the equations $p_i = \partial S/\partial q^i$ is the support of the classical probability distribution:

$$\rho_{cl}[\Psi] = \rho_{cl}[D, S] = D(q)\,\delta\left(p_1 - \frac{\partial S}{\partial q^1}\right) \ldots \delta\left(p_n - \frac{\partial S}{\partial q^n}\right)$$

Both $\rho[D, S]$ and $\rho_{cl}[D, S]$ lead to identical expectation values for canonical affine coordinates in the phase space.
When $\hbar \to 0$, $\rho$ approaches $\rho_{cl}$ and the above system of dots shrinks to $M_S$.

Restricting the canonical vector field $\overline{dH}$ to $M_S$ (this is possible, because $\overline{dH}$ is tangent to $M_S$) and projecting it to the configuration space (the quotient space of $P$ with respect to fibres on which all $q^i$ are constant), we get the following $S$-velocity field $V[S]$:

$$V[S]^i = \partial_{n+i}H(\ldots q^j \ldots, \ldots \partial_j S(q^k) \ldots) \tag{85}$$

For example, when

$$H = \frac{1}{2m}g^{ij}p_i p_j + U(q)$$

then:

$$V[S]^i = \frac{1}{m}g^{ij}\,\partial_j S(q) \tag{85a}$$

and (82) is equivalent to:

$$\pounds_{V[S]}D(q) = 0 \tag{86}$$

where $\pounds_{V[S]}D(q)$ denotes the Lie derivative of $D(q)$ with respect to $V[S]$. Expression (86) means that $D(q)$ is invariant during the motion; hence, it is a continuity equation for quasiclassical stationary states. As a geometric object, $D(q)$ is a density of weight one, hence (86) is a shorthand for

$$V[S]^i\,\partial_i D(q) + D(q)\,\frac{\partial V[S]^i}{\partial q^i} = 0 \tag{86a}$$

Finally, the quasiclassical counterpart of (78) is a couple of conditions:

$$H(\ldots q^i \ldots, \ldots \partial_i S(q^k) \ldots) = E \tag{87a}$$

$$V[S]^i\,\partial_i D(q^k) + D(q^k)\,\frac{\partial V[S]^i}{\partial q^i} = 0 \tag{87b}$$

Conditions (87) could be derived directly from (78) after substitution of (40), (41) and (72) and expansion up to the first asymptotic order in $\hbar$. One should only make use of the method of stationary phase; conditions (87a) and (87b) then result as the real and imaginary parts of the asymptotic formula, respectively (Tulczyjew, 1968).

A system of equations (87) is completely integrable if and only if the conditions (62) are satisfied:

$$\{H_a, H_b\} = C_{ab}{}^{c}(H_c - E_c) \tag{62a}$$

The maximum possible number of independent and compatible simultaneous eigenconditions (87) imposed on the same pair $D$, $S$ equals the number of degrees of freedom $n$. Any system of $n$ independent compatible eigenconditions (87) possesses a solution which is unique up to an additive constant in $S$ and a multiplicative one in $D$. It is essentially equal to the quasiclassical solution found by Van Vleck (1928) and proved by him to coincide with the asymptotic WKB solution.

In more detail, let $A_i: R^{2n} \to R$, $i = 1 \ldots n$, be analytic functions such that the corresponding phase space functions $A_i(q, p)$ are in involution:

$$\{A_i(q, p), A_j(q, p)\} = 0 \tag{88}$$

Let us consider the corresponding compatible system of eigenconditions imposed on the quasiclassical wave function $\sqrt{D}\exp(iS/\hbar)$:

$$A_i(\ldots q^j \ldots, \ldots \partial_j S(q^k) \ldots) = a_i \tag{89a}$$

$$V_i[S]^j\,\partial_j D(q^k) + D(q^k)\,\partial_j V_i[S]^j = 0 \tag{89b}$$

The autonomous subsystem (89a) imposed on $S$ possesses a solution, unique up to an additive constant, which depends in a parametric way on the constants $a_i$. Let us insert this dependence explicitly into $S$ by introducing a function $S: R^{2n} \to R$ such that $S(q^i, a_i)$ is a solution of (89a) for arbitrary values of $a_i$. Obviously, $S(q^i, a_i)$ is a complete integral for any of the Hamilton-Jacobi equations (89a). Substituting $S(q^i, a_i)$ into (89b) we get an autonomous system of first-order differential equations imposed on $D$. It is completely integrable and one can show that the only solution (up to normalization) of this system is given by the Van Vleck determinant:

$$D(q^i, a_i) = \left|\mathrm{Det}\left[\frac{\partial^2 S}{\partial q^i\,\partial a_j}\right]\right| \tag{90}$$

In particular, $S$ can be chosen to be an arbitrary complete integral of the stationary Hamilton-Jacobi equation

$$H(\ldots q^i \ldots, \ldots \partial_j S(q^k, a_k) \ldots) = E \tag{91}$$

where $H$ is a Hamiltonian function of the mechanical system.

The geometric interpretation of the Van Vleck solution in terms of symplectic geometry has been given in Sławianowski (1972). To some extent, the Van Vleck object can be guessed a priori, in terms of pure geometry, without any appeal to the quasiclassical approximation.

Obviously the quasiclassical Wigner-Moyal density in $P$ corresponding to the Van Vleck solution of (89a) is given by

$$\rho_{cl}[D, S] = \delta(A_1 - a_1) \ldots \delta(A_n - a_n) = \left|\mathrm{Det}\left[\frac{\partial^2 S}{\partial q^i\,\partial a_j}\right]\right| \delta(p_1 - \partial_1 S(q, a)) \ldots \delta(p_n - \partial_n S(q, a)) \tag{92}$$

Remark: When $\rho$ is a classical probability distribution the support of which is a closed connected lagrangian submanifold, then $\mathcal{E}_\rho$ is a maximal self-consistent ideal in $C^\omega(P)$ and consequently $\rho$ is a quasiclassical pure state. Contrary to what might be expected, the converse statement is false. In fact, let $H: P \to R$ be such a Hamiltonian that the corresponding dynamical system does not possess any non-trivial constants of motion (excepting the Hamiltonian itself, of course). This is the case, for example, when the system is ergodic (any classical trajectory is dense on some value-surface of $H$). The quotient sets of value-surfaces of $H$ with respect to the congruence of classical trajectories (integral curves of $\overline{dH}$) fail to be differential manifolds in any natural way. Then it is easy to see that the only solution, up to a constant factor, of the classical eigencondition

$$H\rho = E\rho, \quad \{H, \rho\} = 0$$

is the microcanonical ensemble

$$\rho \sim \delta(H - E)$$

$\mathcal{E}_\rho$ is then a maximal self-consistent ideal in $C^\omega(P)$, although $\mathrm{Supp}\,\rho$ fails to be a lagrangian submanifold (it is only a first-class submanifold of $P$).

We have shown above in what way and in what sense the asymptotics $\hbar \to 0$ lead to the Hamilton-Jacobi theory of classical mechanics. The crucial point was that the purely formal asymptotic rules were inadequate to get physically reasonable and satisfactory results. They had to be completed by qualitative physical ideas such as information and symmetry properties. The necessity for such 'supplements' is a typical feature of the formal $\hbar \to 0$ approach, in contrast to the more physical, although less elegant, asymptotics of large quantum numbers.

The physical analysis based on information and symmetry properties of statistical ensembles leads to results which do not agree with some current views concerning pure states. It is well-known that pure states carry the maximal information about the system.
Hence it is reasonable to conjecture their classical counterparts to be distributions or measures the supports of which degenerate to the subsets of phase measure zero. One commonly believes that they are Dirac measures concentrated on single points in a phase space. Hence within the framework of such, currently used analogy classical pure states and points of the phase space are essentially identical notions. In spite of these views but in agreement with the old Hamiltonian ideas of Synge and Dirac (Dirac, 1964; Synge, 1953; Synge, 1954) we have shown that quasiclassical probability distributions corresponding to pure states are con- centrated on lagrangian submanifolds the dimension of which equals the number of degrees of freedom. Hence, to any quasiclassical pure state there is attached some set of the usual, classical states — a lagrangian manifold in a phase space. Each of these classical states is taken with its own statistical weight. Hence classical states, that is points in a phase space, are 'hidden parameters' of quasiclassical ones. We have presented both physical and a priori geometrical arguments in support of our views. Let us notice that our approach agrees nicely with the Bohr-Sommerfeld conditions of the old quantum theory. In fact the objects upon which these conditions are imposed are none other than lagrangian manifolds in a phase space. Let (. . . q' ....... pi . . .) be canonical affine coordinates on P; we introduce the Pfaff form o> a>= Pi dq' (93) Obviously in arbitrary affine coordinates Now, let M a be a Lagrangian submanifold where 'I* ={p eP: A, ; (p) = Oh i = l...n} {A„A,} = (94) (95) The Bohr-Sommerfeld conditions imposed on Ji a mean that the line integral of w over any closed curve in M a , equals the Planck constant multiplied by some integer characterizing a loop of integration. 
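As a hedged illustration of the Bohr-Sommerfeld condition just stated, the following sketch evaluates the loop integral of the Pfaff form p dq numerically for a one-dimensional harmonic oscillator H = (p^2 + q^2)/2. The choice of oscillator, the units (unit mass and frequency, hbar = 1) and all function names are assumptions made for illustration and do not come from the text. The level curve H = E is a closed lagrangian submanifold (a circle in the phase plane), the loop integral equals 2*pi*E, and the condition that it be an integer multiple of h therefore selects E = n*hbar.

```python
import math

HBAR = 1.0                          # assumed units (hbar = 1)
PLANCK_H = 2.0 * math.pi * HBAR     # h = 2*pi*hbar


def loop_integral(energy, steps=20000):
    """Integrate omega = p dq once around the closed curve H(q, p) = energy,
    for the illustrative oscillator H = (p**2 + q**2) / 2."""
    radius = math.sqrt(2.0 * energy)
    dt = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt              # midpoint rule in the curve parameter
        p = radius * math.cos(t)        # curve: q = r sin t, p = r cos t
        dq_dt = radius * math.cos(t)    # dq/dt along the curve
        total += p * dq_dt * dt
    return total


def bohr_sommerfeld_energy(n):
    """Energy for which the loop integral equals n * h (here E_n = n * hbar)."""
    return n * PLANCK_H / (2.0 * math.pi)


# The ratio (loop integral) / h should be close to the integer n.
for n in (1, 2, 3):
    print(n, loop_integral(bohr_sommerfeld_energy(n)) / PLANCK_H)
```

The sketch uses the old-quantum-theory form of the condition exactly as stated above, with no half-integer correction.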
When M a are topologically equiva- lent to tori which is the case in the theory of multiply periodic systems, these conditions give rise to non-trivial restrictions on the admissible quantized (in the Bohr-Sommerfeld sense) values of physical quantities A x . . . A n . The currently used analogy between pure quantum states and points of the phase space is based on the properties of Gaussian-shape Wigner functions. In 172 Uncertainty Principle and Foundations of Quantum Mechanics spite of its formal correctness this argument is physically wrong. Let us investigate this in some detail. Let (q\ p,) be canonical affine coordinates and H laJ>) = \l(q'-a') 2 +ll(Pi-b t ) 2 (96) Obviously, H iayb) is a Hamiltonian of the harmonic oscillator the equilibrium of which is given by the point in P with coordinates (a, b t ). The corresponding Gaussian-Wigner function is given by E (a ,b) = (rrh) " exp |^ --H (a , 6) J (97) One can easily show that E (a , b) is a Wigner function describing some pure state (2Trh)- n ^E (a , b) dq 1 ...dp n = l (98) E( a ,b) ~ E( a ,b) E(a,b)- i -E( atb ) = E( a ,b) ^E (aJ>) (A*±A)dq l ...dp n >0 (101) (99) (100) (As a- matter of fact, E (a , b) is positive for arbitrary A in the local sense too: £ W) >0). One can easily show that lim Et ab) = 8(q 1 -a l ) . ..(p„-b„) h-0 (102) This could be understood in such a way that at least some pure states possess classical counterparts concentrated on single points (point measures). However this is only a typical example of mistakes arising when no care is taken with regard to the scale of quantum numbers. In fact E {a , b ) describes the physical situation with small quantum numbers. To see this let us notice that it is essentially concentrated in the region the phase volume of which is of the order h". Moreover, E iaJ>) is none other than the ground state of the harmonic oscillator: -ff(a,i.)-L-E(o,6) - inhE( a ,b) (103) where n is the number of degrees of freedom. 
Therefore according to what we have mentioned above the h -» asymptotics of E ia _ b ) is physically meaningless. We finish this chapter with some remarks concerning quasiclassical problems of the superposition principle and interference of amplitudes. Let us consider the wave mechanics in R". An arbitrary wave function ¥ = a exp(iS/h) gives rise to some geometric figure in R n+1 , namely the graph of its phase: Pm = P[<r, S] = {(x, S(x)):x eR n } (104) Slawianowski 173 Now let us consider an arbitrary m -parameter family of wave functions {* a =a- a exp(^S a ), aeR m } and the corresponding system of phase diagrams PWa\ = PWa,Sa\ Let <p = fi exp(it/h) be some complex amplitude on R m . It gives rise to the continuous superposition ¥ = a exp^-S = Ufa)^ da 1 ... da" (105) which can be represented by its phase graph P\^f\ As usual in the lowest order of the WKB approximation we assume a a , S a , fi, t not be dependent on the Planck constant. Let °^ = °a exp(i°S/h) denote the lowest order term of the asymptotic expansion of ^ about h = renormalized so as to retain the probabilistic interpretation in the classical limit: (V|>)=f V¥ = The corresponding phase diagram will be denoted as P\°"V] = P\°cr, °S]. Making use of the method of stationary phase when calculating (105) we get the result that essentially P[ ^] is an envelope of the family {P[V a . 4>(a)} = {PUt(a) . a a , t{a) + S a \} This is just the quasiclassical Huyghens interference rule. More rigorously exp - i sign d 2 (S ( .)(x) + <f> (•))o (x)J V/i hes aoU) (S(.)(x) + <£(•)) = Nn (a (x))o-a„(x)(*) exp ^-S^Cs) + <t> (a (x)) J exprJTri sign d 2 (S ( .)(x) + ^(OWq] Vfc"hes fl0(l) (S(.,(x) + ^(-)) where a (x) is the solution of equations: (106) - 7 [s ( . 
) (x)+<M-)]=o da' (107) N is an h -dependent normalizing factor ensuring that ^ retains its probabilis- tic interpretation and d 2 f y is the matrix of the second-order derivatives off at y, hes y / is the determinant of this matrix (hessian of / at y) and sign 8 f y is a signature of the symmetric matrix d 2 f y (i.e. a difference between the number of positive and negative eigenvalues). ! 174 Uncertainty Principle and Foundations of Quantum Mechanics When the equation (107) possesses more than one solution, then, (106) becomes the sum of terms corresponding to all solutions a (x). The Huyghens-Fresnel interference rule (107) gives rise to the classical Huyghens superposition principle. In fact when all S a (q') are solutions of the same Hamilton-Jacobi equation (83) then °S (<?') corresponding to the envelope of {P[V a • <t>(a)] = P[p-{a) • <r a , t(a) + S a ]} is a solution of this equation too (Caratheodory, 1956). The envelopewise superposition principle for the classical Hamiltonian-Jacobi equation appears to be the correspondence principle counterpart of the usual linear superposi- tion principle for the Schrodinger equation (Stawianowski, 1971; Stawianowski, 1975). In particular one can solve initial (boundary) problems for the Hamilton-Jacobi equation by the envelopewise superposition of classi- Figure 2 The classical Huyghens superposition rule. Continuous superposition of wave functions ¥„ = a a exp(tf a /ft) with coefficients <p(a) = M«)exp[<f(a)/ft], V- o-exp(iS//») = jV(a)<kda. When h^O, the diagram of S becomes an envelope of diagrams of {S a + t(a)} (i.e. the diagram of °S). Stawianowski 175 cal 'propagators' (Stawianowski, 1971; Stawianowski, 1975) with initial boundary conditions as 'coefficients' of 'superposition' (Stawianowski, 1975). 4. ALGEBRAIC AND PHYSICAL PROPERTIES OF EIGENSTATES AND PURE STATES. 
QUASICLASSICAL COMPATIBILITY

In the previous section we derived some results concerning the correspondence principle for mechanical systems with affine symmetry of degrees of freedom. The Weyl-Wigner-Moyal formulation enabled us to achieve this in an almost deductive way. Unfortunately, this method does not work for general mechanical systems. This is why we have paid special attention to structures which do not depend explicitly on the particular properties of the space of states. Starting with these structures, we now present some general statements concerning the analogy and correspondence between classical and quantum theory, with special attention to the quasiclassical compatibility problem.

The geometric guidance from the theory of mechanical systems in affine spaces suggests some general analogies between classical and quantum concepts without explicit calculations and asymptotic expansions. To justify these analogies strictly one should appeal to the correspondence principle formulated in terms of large quantum numbers. However, we will not do this here because we believe that the general information and symmetry properties of statistical ensembles do not depend on the particular structure of the manifold of states.

We start with the classical probability calculus of discrete sets. Let $I$ be a countable set of elementary events. No additional structure on $I$ is supposed; it is only a set. Let $C(I)$ denote the linear space of complex-valued functions on $I$. Obviously, $C(I)$ is an associative algebra under the local, pointwise product. Statistical ensembles are described by probability distributions on $I$, i.e. real, non-negative and normalized functions $\rho$ on $I$:

$$\rho^*(i) = \rho(i), \qquad \rho(i) \ge 0, \qquad \sum_i \rho(i) = 1$$

Physical quantities, i.e. random variables, are described by real functions $A$ on $I$: $A^*(i) = A(i)$.
Expectation values are given by the usual formula:

$$\langle A \rangle_\rho = \sum_i A(i)\,\rho(i)$$

Now let $\mathfrak{E}_\rho$ be the linear subspace of $C(I)$ composed of all random variables which vanish on the support of $\rho$:

$$\mathfrak{E}_\rho = \{F \in C(I) : F\rho = 0\} \qquad (108)$$

Obviously $\mathfrak{E}_\rho$ is an ideal in the associative algebra $C(I)$: it is closed under pointwise multiplication by all elements of $C(I)$. All random variables in $\mathfrak{E}_\rho$ are free from statistical dispersion on the $\rho$-ensemble; all measurements give a sharply defined result, namely zero. In particular, let $F \in \mathfrak{E}_\rho$ and $F = A - a$, where $a$ is some constant. Then $A$ has a sharply defined value $a \in A(I)$ on $\rho$ and the following eigenequation is satisfied:

$$A\rho = a\rho \qquad (109)$$

The greater $\mathfrak{E}_\rho$ is, the more information is carried by $\rho$: $\mathfrak{E}_{\rho_1} \subset \mathfrak{E}_{\rho_2}$ implies

$$\sum_i \rho_1(i)\ln\rho_1(i) < \sum_i \rho_2(i)\ln\rho_2(i) \qquad (110)$$

Roughly speaking, by imposing additional eigenequations (109) upon $\rho$ we increase its informative content, because the number of measurements with completely predictable results is then increased.

Now let us assume that $\mathfrak{E}_\rho$ is a maximal non-trivial ideal. Then $\rho$ answers the maximal number of non-trivial questions, i.e. the maximal number of measurements have unique, certain results (the maximal number of eigenconditions (109) is satisfied). Such a $\rho$ is a pure state of the classical probability calculus. It is obvious that for an arbitrary pure state $\rho$ there exists a point $i \in I$ such that $\rho(j) = \delta_{ij}$, i.e. $\rho$ is a point measure and

$$\rho\rho = \rho \qquad (111)$$

Information (entropy) then takes its maximal (minimal), i.e. vanishing, value:

$$\sum_j \rho(j)\ln\rho(j) = 0 \qquad (112)$$

All physical quantities $A$ then have sharply defined values $A(i)$, and $\mathfrak{E}_\rho$ consists of the functions vanishing at $i \in I$. Thus $\mathfrak{E}_\rho$ is maximal if and only if all random variables are dispersion-free on $\rho$. This is just the main peculiarity of the classical probability calculus.

Now let us turn to quantum statistics. The state space of a quantum system is a separable Hilbert space $H$.
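Before the quantum case is developed further, the classical statements above can be checked on a three-element set of elementary events. The following is a minimal sketch; the event labels, the sample random variable and the function names are all hypothetical and chosen only for illustration. It shows that a point measure is dispersion-free for an arbitrary random variable and has vanishing information sum, whereas a mixed ensemble with a larger support has a smaller ideal and a strictly negative information sum, in line with (110) and (112).

```python
import math


def expectation(A, rho):
    """<A>_rho = sum_i A(i) rho(i)."""
    return sum(A[i] * rho[i] for i in rho)


def dispersion(A, rho):
    """Variance of the random variable A on the ensemble rho."""
    mean = expectation(A, rho)
    return sum((A[i] - mean) ** 2 * rho[i] for i in rho)


def information(rho):
    """sum_i rho(i) ln rho(i), with the convention 0 ln 0 = 0."""
    return sum(p * math.log(p) for p in rho.values() if p > 0.0)


def in_ideal(F, rho):
    """True when F vanishes on the support of rho, i.e. F * rho = 0."""
    return all(F[i] * rho[i] == 0 for i in rho)


A = {"x": 2.0, "y": -1.0, "z": 5.0}          # an arbitrary random variable
point = {"x": 0.0, "y": 1.0, "z": 0.0}       # point measure at "y"
mixed = {"x": 0.5, "y": 0.5, "z": 0.0}       # a non-pure ensemble

F = {i: A[i] - A["y"] for i in A}            # F = A - a with a = A("y")
print(in_ideal(F, point), dispersion(A, point), information(point))
print(in_ideal(F, mixed), dispersion(A, mixed), information(mixed))
```

Here F = A - a belongs to the ideal of the point measure, so the eigenequation (109) holds with a = A("y"), while the same F is not in the ideal of the mixed ensemble.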
Let $B(H)$ denote the associative but non-commutative algebra of bounded linear operators in $H$. Physical quantities are described by Hermitian elements of $B(H)$: $A^+ = A$. Statistical ensembles are described by density operators $\rho$, i.e. Hermitian, positive and trace-class elements of $B(H)$:

$$\rho^+ = \rho, \qquad \mathrm{Tr}\,\rho = 1, \qquad \mathrm{Tr}(\rho A^+ A) \ge 0 \ \text{for any } A \qquad (113)$$

The expectation value of $A \in B(H)$ on $\rho$ is given by

$$\langle A \rangle_\rho = \mathrm{Tr}(A\rho) \qquad (114)$$

As in the classical probability calculus, we define

$$E_\rho = \{F \in B(H) : F\rho = 0\} \qquad (115)$$

Obviously, $E_\rho$ is a left ideal in the associative algebra $B(H)$; it is closed under all transformations $F \mapsto GF$. The algebraic and physical interpretation of $E_\rho$ is analogous, to some extent, to that of the classical ideal $\mathfrak{E}_\rho$ defined in (108). In fact, $E_\rho$ describes all measurements which, performed on the statistical ensemble $\rho$, give sharp results without any statistical dispersion. Hermitian elements of $E_\rho$ describe physical quantities which take a definite value, namely zero, when measured on $\rho$. When $A \in B(H)$ is a physical quantity ($A^+ = A$) which takes a value $a \in \mathrm{Sp}\,A$ on the ensemble $\rho$ without any statistical spread, i.e.

$$A\rho = \rho A = a\rho$$

then obviously

$$F = (A - aI) \in E_\rho \qquad (116)$$

The greater $E_\rho$ is, the more experimental questions are uniquely (spread-free) answered by $\rho$. As in the classical case, $E_{\rho_1} \subset E_{\rho_2}$ implies the following inequality for information (entropy):

$$\mathrm{Tr}(\rho_1 \ln \rho_1) < \mathrm{Tr}(\rho_2 \ln \rho_2)$$

Pure states are defined as those answering a maximal number of experimental questions, i.e. those for which $E_\rho$ is a maximal (non-trivial) left ideal in $B(H)$.
It is easy to show that for an arbitrary maximal left ideal $E_\rho$ there exists a one-dimensional linear subspace $V \subset H$ such that $E_\rho$ consists of the operators vanishing on $V$:

$$E_\rho = \{F \in B(H) : F|V = 0\} \qquad (117)$$

This implies that $\rho$ is a projector mapping $H$ onto $V$, and

$$\rho\rho = \rho \qquad (118)$$

As in classical statistics, information (entropy) then takes its maximal (minimal) value:

$$\mathrm{Tr}(\rho \ln \rho) = 0 \qquad (119)$$

All the above equations in $B(H)$ are formally analogous to the classical ones in $C(I)$. However, the non-commutativity of $B(H)$ entails serious physical differences. In fact, when $\rho$ is a classical probability distribution describing a pure state (point measure; $\mathfrak{E}_\rho$ maximal), then obviously the quotient space $C(I)/\mathfrak{E}_\rho$ is isomorphic with the one-dimensional field of complex numbers $\mathbf{C}$. Therefore the pure ensembles of classical statistics are characterized by definite values of all random variables; they satisfy eigenequations (109) for any $A \in C(I)$. This is not the case in quantum theory. Even if $\rho \in B(H)$ is a pure state ($E_\rho$ is maximal), there exist physical quantities which take no definite value on $\rho$. A physical quantity $A$ is spread-free on $\rho$ if and only if the image $V$ of $\rho$ (i.e. $V = \rho(H)$) is an eigenspace of $A$:

$$A|V = a\,\mathrm{Id}_V$$

There exists one more peculiarity of quantum statistics which has no counterpart in the classical probability calculus. It is also strictly related to some formal algebraic differences between $C(I)$ and $B(H)$. In fact, the only algebraic structure carried naturally by $C(I)$ is that of a commutative associative algebra over the complex field $\mathbf{C}$.
On the contrary, the non-commutative associative product in $B(H)$ (superposition) gives rise to another non-trivial algebraic structure, namely that of a Lie algebra under the quantum Poisson bracket

$$[A, B] = \frac{1}{i\hbar}(AB - BA) \qquad (120)$$

which is skew-symmetric and satisfies the Jacobi identity:

$$[[A,B],C] + [[B,C],A] + [[C,A],B] = 0 \qquad (120a)$$

As we have mentioned above, all informative, essentially statistical properties of statistical ensembles are described in terms of the associative algebra structure (ideals $E_\rho$, $\mathfrak{E}_\rho$, eigenvalues, eigenequations and so forth). By contrast, the structure of Lie algebra in $B(H)$ is strictly related to the symmetry properties of statistical states. In fact, any Hermitian element $A \in B(H)$ gives rise to two physically distinct kinds of operations:

(1). One can measure $A$ on a statistical state $\rho$ and perform a statistical analysis of the spread of the results. The statistical dispersion is given by

$$\sigma^2(A, \rho) = \mathrm{Tr}(A^2\rho) - (\mathrm{Tr}(A\rho))^2 \qquad (121)$$

Spread-free eigenensembles satisfy the operator eigenequation

$$A\rho = \rho A = a\rho \qquad (122)$$

Hence informative, statistical notions are related to the associative algebra structure.

(2). $A$ gives rise to a one-parameter group of unitary automorphisms of the theory:

$$B(H) \ni B \mapsto B_t = \exp\left[\frac{i}{\hbar}tA\right] B \exp\left[-\frac{i}{\hbar}tA\right] \in B(H) \qquad (123)$$

An infinitesimal description of this group is given in terms of the Lie algebra structure:

$$\frac{d}{dt}B_t = [B_t, A], \qquad B_0 = B \qquad (124)$$

hence

$$\left.\frac{dB_t}{dt}\right|_{t=0} = [B, A]$$

Automorphisms (123) preserve all laboratory-measurable quantities, i.e. expectation values and probabilities of detection [$\mathrm{Tr}(A\rho)$, $\mathrm{Tr}(\rho_1\rho_2)$]. This is because they preserve both the trace and the associative product; this gives rise to the following compatibility condition for the associative and Lie algebraic structures:

$$[A, BC] = [A, B]C + B[A, C] \qquad (125)$$

As we have mentioned in the previous chapters, from the laboratory point of view information and symmetry properties are essentially independent of each other.
Nevertheless, in Hamiltonian quantum mechanics they become interrelated: information implies symmetry. In fact, the operator eigenequation (122) implies that

$$[A, \rho] = 0$$

Therefore the lack of statistical dispersion of $A$ on the ensemble $\rho$ implies that $\rho$ is invariant under the one-parameter unitary group generated by $A$. This implies that $\mathrm{Re}\,E_\rho$, that is, the real linear subspace of $E_\rho$ composed of the Hermitian operators, is a Lie algebra over the field of real numbers. This is true for an arbitrary density operator $\rho \in B(H)$.

To summarize the above considerations, we now compare the main algebraic and physical features of quantum statistics and the classical probability calculus:

(1). If $\rho$ is a classical probability distribution, then $\mathfrak{E}_\rho$ is an associative ideal in $C(I)$ (cf. (108)). This structure gives an account of the informative properties of $\rho$.

(2). If $\rho \in B(H)$ is a density operator, then:
(a) $E_\rho$ defined in (115) is a left ideal in the associative algebra $B(H)$. This structure describes the informative properties of $\rho$.
(b) $\mathrm{Re}\,E_\rho$ is a real Lie subalgebra of the Lie algebra $B(H)$ (under the quantum Poisson bracket). This structure gives an account of the symmetries of $\rho$ implied by its informative properties.

Now let us investigate the corresponding structures in classical Hamiltonian mechanics. In contrast to the previous sections, we will not assume the affine geometry of degrees of freedom. Hence we must appeal to the general formulation of mechanics based on symplectic geometry (Abraham, 1967; Arnold, 1971; Arnold, 1974; Sternberg, 1964; Hermann, 1970; Kostant, 1970; Souriau, 1970).
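Before turning to the classical structures, the quantum statements summarized above can be checked in the smallest non-trivial case, a two-dimensional Hilbert space. The following is a minimal sketch under stated assumptions: the choice of Pauli matrices as sample physical quantities, the units (hbar = 1) and all helper names are illustrative and not taken from the text. It verifies that a projector is a pure state on which one quantity is spread-free while another is not, that the spread-free quantity has vanishing bracket with the state (information implies symmetry), and that the bracket (120) satisfies the Leibniz rule (125) and the Jacobi identity (120a).

```python
HBAR = 1.0  # assumed units


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(X, Y, sign=1):
    """Entrywise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def bracket(X, Y):
    """Quantum Poisson bracket (120): [X, Y] = (XY - YX) / (i hbar)."""
    diff = combine(matmul(X, Y), matmul(Y, X), sign=-1)
    return [[entry / (1j * HBAR) for entry in row] for row in diff]


def dispersion(A, rho):
    """sigma^2(A, rho) = Tr(A^2 rho) - (Tr(A rho))^2, as in (121)."""
    value = trace(matmul(matmul(A, A), rho)) - trace(matmul(A, rho)) ** 2
    return complex(value).real


def is_zero(X, tol=1e-12):
    return all(abs(entry) <= tol for row in X for entry in row)


SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
RHO = [[1, 0], [0, 0]]          # projector onto the first basis vector

# Pure state: rho rho = rho; sigma_z is spread-free on it, sigma_x is not.
print(matmul(RHO, RHO) == RHO, dispersion(SIGMA_Z, RHO), dispersion(SIGMA_X, RHO))

# Information implies symmetry: the spread-free quantity has zero bracket with rho.
print(is_zero(bracket(SIGMA_Z, RHO)), is_zero(bracket(SIGMA_X, RHO)))

# Leibniz rule (125) and Jacobi identity (120a) for the bracket.
A, B, C = SIGMA_X, SIGMA_Y, combine(SIGMA_X, SIGMA_Z)
leibniz = combine(
    bracket(A, matmul(B, C)),
    combine(matmul(bracket(A, B), C), matmul(B, bracket(A, C))),
    sign=-1,
)
jacobi = combine(
    combine(bracket(bracket(A, B), C), bracket(bracket(B, C), A)),
    bracket(bracket(C, A), B),
)
print(is_zero(leibniz), is_zero(jacobi))
```

A finite-dimensional check of this kind does not, of course, address the domain questions that arise for operators on an infinite-dimensional Hilbert space.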
Let us start with some mathematical preliminaries: The classical phase space of a mechanical system is a pair (P, y) where P is an analytic differential manifold (the set of classical states of a system) and y is a non-degenerate and closed differential two-form on P of class C"{P): if (y, X®Y) = for an arbitrary vector field Y, then X = dy = Making use of local coordinates f " on U^P, we have: r|t/=ra6d^ a Ad£* (126) (127) 180 Uncertainty Principle and Foundations of Quantum Mechanics where: dettoJ^O (126a) y a „c + y^ + TW = (127a) Hence dimP = 2n where n is an integer, the so-called number of degrees of ^TteDarboiix theorem (Abraham, 1967; Sternberg, 1964) implies the existence of canonical coordinates C such that: (128) "HI-/ oil [compare with expression (4)]. One uses then the historical notation: (O- (q\ Pi) and: y |L7 = dp j Adq" (128a) The contravariant skew-symmetric tensor reciprocal to y will be denoted as y : -8° (129) ~bc yab7 : ab We will also write y ab simply instead of y" . The Poisson bracket of the differentiable functions F, G is defined as: {F,G} = <dF®dG,f> (13°) In the coordinates: (130a) ^ dF dG It is skew-symmetric and (127) implies that the Jacobi identity holds hence the Poisson bracket turns C°(P) (and (TiP)) into Lie algebra. Now let F be a differentiable function of P and dF— its differential. Raising the index' of the Pfaff form dF by means of y, one obtains a contravariant vector field on P, the^sp-called Hamiltonian vector field generated by F. One denotes it as dF: (y, dF®Y)= -(dF, Y); in the coordinates: The components of dF with respect to the canonical coordinates (q\ p t ) are given by: (dF _dF\ The tensor y gives rise to the skew-symmetric scalar product and skew- symmetric orthogonality of vectors. We say that two vectors attached at the same point p e P are y -orthogonal if: (y P ,u®v) = y pab u a v" = (131) Stawianowski 181 Obviously, all vectors are self -orthogonal (skew symmetry of y). 
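The statement above that the Poisson bracket is skew-symmetric and satisfies the Jacobi identity can be checked exactly for polynomial functions of one degree of freedom. The following is a sketch only: polynomials in (q, p) are stored as dictionaries mapping exponent pairs to integer coefficients, the helper names are hypothetical, and the sign convention {f, g} = (df/dq)(dg/dp) - (df/dp)(dg/dq) is an assumption, since the coordinate expression of the bracket is not legible in the text.

```python
def clean(f):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {key: c for key, c in f.items() if c != 0}


def add(f, g, sign=1):
    """Polynomial f + sign * g."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + sign * c
    return clean(out)


def mul(f, g):
    """Pointwise (associative, commutative) product of two polynomials."""
    out = {}
    for (a, b), c in f.items():
        for (i, j), d in g.items():
            key = (a + i, b + j)
            out[key] = out.get(key, 0) + c * d
    return clean(out)


def d_dq(f):
    return {(a - 1, b): a * c for (a, b), c in f.items() if a > 0}


def d_dp(f):
    return {(a, b - 1): b * c for (a, b), c in f.items() if b > 0}


def poisson(f, g):
    """{f, g} = df/dq dg/dp - df/dp dg/dq  (assumed sign convention)."""
    return add(mul(d_dq(f), d_dp(g)), mul(d_dp(f), d_dq(g)), sign=-1)


Q = {(1, 0): 1}                      # the coordinate q
P = {(0, 1): 1}                      # the momentum p
F = {(1, 1): 1, (3, 0): 2}           # q p + 2 q^3
G = {(0, 3): 1, (2, 1): -1}          # p^3 - q^2 p
H = {(2, 0): 1, (0, 2): 1}           # q^2 + p^2

skew = add(poisson(F, G), poisson(G, F))
jacobi = add(add(poisson(poisson(F, G), H), poisson(poisson(G, H), F)),
             poisson(poisson(H, F), G))
print(poisson(Q, P), skew, jacobi)
```

Because the coefficients are integers, the two identities are verified exactly rather than up to rounding error; with the opposite sign convention the bracket of q and p changes sign, but both identities still hold.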
Hence, as in the previous section, we can define I-class submanifolds and lagrangian submanifolds in $P$. The submanifold $M \subset P$ is said to be of I-class if any vector $\gamma$-orthogonal to $M$ is at the same time tangent to $M$. $M \subset P$ is called isotropic if any vector tangent to $M$ is at the same time $\gamma$-orthogonal to $M$. $M \subset P$ is said to be lagrangian when it is both I-class and isotropic. Obviously, the dimension of lagrangian submanifolds equals the number of degrees of freedom, $n = \frac{1}{2}\dim P$.

Now let

$$M = \{p \in P : F_i(p) = 0,\ i = 1 \ldots m\}$$

$M$ is an I-class submanifold if and only if the Poisson brackets of the $F_i$ vanish weakly:

$$\{F_i, F_j\}|M = 0$$

i.e.

$$\{F_i, F_j\} = C^k_{ij} F_k \qquad (132)$$

for some functions $C^k_{ij}$ (Dirac, 1964).

Any I-class submanifold $M$ of co-dimension $m$ is foliated by a family $K(M)$ of $m$-dimensional isotropic submanifolds in such a way that all vectors tangent to this foliation are $\gamma$-orthogonal to $M$ (Tulczyjew, 1968; Sławianowski, 1971; Dirac, 1950; Bergmann and Goldberg, 1955). $K(M)$ is called the characteristic or singular foliation of $M$. Some global properties of the characteristic foliation are important in quasiclassical problems.

The I-class submanifold $M$ is called simple when its quotient set $P'(M) = M/K(M)$ carries the natural $2(n-m)$-dimensional differential structure of class $C^\omega$ projected from $M$. When $M$ is simple, any smooth function on $P'(M)$ gives rise to some non-trivial function on $M$ which is constant when restricted to any fibre of $K(M)$. For example, when such an $M$ is a value surface of the Hamiltonian $H$, the corresponding dynamical system is completely degenerate and admits $(2n-1)$ non-trivial, independent and autonomous constants of motion (the Hamiltonian itself included among them).

The quotient manifold $P'(M)$ of a simple submanifold $M$ carries a natural phase space structure, because $\gamma|M$ is projectable from $M$ to $P'(M)$. The resulting phase space $(P'(M), \gamma')$ is called the reduced phase space of $M$; it describes the gauge-free physical degrees of freedom of $M$.
Now let us describe the opposite case: an I-class submanifold $M$ is said to be primitive when the only smooth functions on $M$ constant on the fibres of $K(M)$ are those constant over the whole of $M$. For example, the value surfaces of an ergodic Hamiltonian are primitive submanifolds; the only non-trivial constant of motion is the Hamiltonian (energy) itself. Obviously, an arbitrary lagrangian submanifold is both simple and primitive.

The skew-symmetric tensor $\gamma$ gives rise to a nowhere-vanishing differential form of the maximal possible degree $2n$, namely

$$\gamma \wedge \ldots \wedge \gamma \qquad (133)$$

It is well known that such a form gives rise to a measure on $P$ (Abraham, 1967; Schouten, 1954; Sławianowski, 1975). It is convenient to divide it by $(2\pi\hbar)^n = h^n$ (cf. Section 2); the corresponding dimensionless measure on $P$ will be denoted by $\mu$. Obviously, when using canonical coordinates $(q^1 \ldots p_n)$, the measure $\mu$ is given by

$$\int f\,d\mu = \left(\frac{1}{2\pi\hbar}\right)^n \int f\,dq^1 \ldots dq^n\,dp_1 \ldots dp_n \qquad (134)$$

for an arbitrary smooth function $f$ whose support is contained in the domain on which the coordinates $(q^1 \ldots p_n)$ are defined. The existence of a canonical measure enables us to describe statistical ensembles in $P$ by means of non-negative scalar functions or distributions $\rho$ normalized to unity:

$$\int \rho A^* A\,d\mu \ge 0 \ \text{for any } A \qquad (135)$$

$$\int \rho\,d\mu = 1 \qquad (136)$$

Physical quantities are described by analytic functions on $P$. The linear space $C^\omega(P)$ of all analytic functions on $P$ carries two natural algebraic structures:
(a) $C^\omega(P)$ is an associative algebra (obviously a commutative one) under the pointwise product;
(b) $C^\omega(P)$ is a Lie algebra under the Poisson bracket operation.

In contrast to the quantum case, the associative and Lie algebraic structures in $C^\omega(P)$ are algebraically independent, which gives rise to the separation of information and symmetry properties of statistical ensembles.
Nevertheless, they are compatible in the sense that the Lie structure gives rise to derivations of the associative structure, i.e. the Leibniz rule is satisfied:

$$\{A, BC\} = \{A, B\}C + B\{A, C\} \qquad (137)$$

which is an obvious counterpart of (125). Hence, just as in quantum theory, an arbitrary physical quantity $A \in C^\omega(P)$, $A^* = A$, can be related to two kinds of physical operations: measurements and transformations.

Statistical analysis of $A$ is based on the well-known formulas of the classical probability calculus, involving the associative algebraic structure on $C^\omega(P)$:

$$\langle A \rangle_\rho = \int A\rho\,d\mu \qquad (138)$$

$$\sigma^2(A, \rho) = \int A^2\rho\,d\mu - \left(\int A\rho\,d\mu\right)^2 = \langle A^2 \rangle_\rho - \langle A \rangle_\rho^2 \qquad (139)$$

Spread-free statistical ensembles of $A$ satisfy the eigenequation

$$A\rho = a\rho \qquad (140)$$

A one-parameter group of automorphisms of $C^\omega(P)$ (canonical transformations) generated by $A$ consists of transformations $B \mapsto B_t$ such that

$$\frac{d}{dt}B_t = \{B_t, A\}, \qquad B_0 = B \qquad (141)$$

hence

$$\left.\frac{dB_t}{dt}\right|_{t=0} = \{B, A\}$$

Such transformations preserve both the associative product (due to expression (137)) and the Poisson bracket (due to the Jacobi identity). Hence they preserve all measurable quantities of the theory.

Now we are able to investigate the classical counterparts of the quantum ideals $E_\rho$ and pure states in some detail. Let us start with some definitions.

An ideal $V \subset C^\omega(P)$ in the associative algebra $C^\omega(P)$ is said to be probabilistic if there exists a subset $M \subset P$ such that

$$V = \{F \in C^\omega(P) : F|M = 0\} \qquad (142)$$

Such an ideal is denoted by $V(M)$. Let $\rho$ be an arbitrary probability distribution and let us put

$$C^\omega(P) \supset \mathfrak{E}_\rho = \{F \in C^\omega(P) : F\rho = 0\} \qquad (143)$$

Obviously

$$\mathfrak{E}_\rho = V(\mathrm{Supp}\,\rho) \qquad (144)$$

which justifies the name we have used above. Such an ideal describes uniquely the space of all physical quantities which are dispersion-free on a given statistical ensemble. Any ideal in the associative algebra $C^\omega(P)$ is contained in some probabilistic ideal. As an example, let us mention the ideal of all functions which vanish on a given subset together with their derivatives up to some fixed order.
An associative ideal $V \subset C^\omega(P)$ is called self-consistent when it is at the same time a Lie subalgebra of $C^\omega(P)$. A probabilistic ideal $\mathfrak{E}_\rho$ is self-consistent if and only if $\mathrm{Supp}\,\rho$ is a first-class (I-class) submanifold of $P$. When physical quantities $F, G \in C^\omega(P)$ are simultaneously spread-free on $\rho$ and $\mathfrak{E}_\rho$ is self-consistent, then

$$\{F, G\}|\mathrm{Supp}\,\rho = 0, \quad \text{i.e. } \{F, G\} \in \mathfrak{E}_\rho \qquad (145)$$

These are just the classical compatibility conditions. In particular, when both $(q^i, p_i)$ are dispersion-free on $\rho$, then $\mathfrak{E}_\rho$ fails to be self-consistent. Hence the geometric a priori of symplectic manifolds seems to anticipate, on the purely classical level, the Heisenberg uncertainty principle.

In the classical probability calculus of discrete sets we had no compatibility restrictions; all probability distributions and all ideals $\mathfrak{E}_\rho$ were admissible and consistent. This is not the case in Hamiltonian mechanics. The correspondence-principle analysis based on information and symmetry suggests that the only physically justified probability distributions $\rho$ on $P$ are those for which the ideals $\mathfrak{E}_\rho$ are self-consistent. Hence a classical probability distribution whose support fails to be an I-class submanifold is only a technically convenient shorthand for probability distributions closely concentrated around the 'support' but essentially smeared out beyond it. Consequently, the lowest possible dimension of the support of a probability distribution satisfying the classical compatibility and uncertainty restrictions equals the number of degrees of freedom (the lowest dimension of an I-class submanifold). In particular, point measures are incompatible.

As we have mentioned in the previous section, the classical counterparts of pure states are probability distributions $\rho$ for which $\mathfrak{E}_\rho$ is a maximal self-consistent probabilistic ideal.
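The incompatibility of point measures noted above can be made explicit by a short worked check. This is a sketch for one degree of freedom; it writes $\mathfrak{E}_\rho$ for the ideal of functions vanishing on $\mathrm{Supp}\,\rho$ and assumes the standard normalization $\{q, p\} = 1$ (the opposite sign convention leads to the same conclusion). Suppose that $q$ and $p$ are both dispersion-free on $\rho$, with values $a$ and $b$:

```latex
\begin{aligned}
&F = q - a \in \mathfrak{E}_\rho, \qquad G = p - b \in \mathfrak{E}_\rho
 && \text{(both $q$ and $p$ are dispersion-free on $\rho$)} \\
&\{F, G\} = \{q, p\} = 1
 && \text{(constants drop out of the bracket)} \\
&\{F, G\} \in \mathfrak{E}_\rho
 \;\Longrightarrow\; 1 \cdot \rho = 0
 \;\Longrightarrow\; \rho = 0
 && \text{(if $\mathfrak{E}_\rho$ were a Lie subalgebra)}
\end{aligned}
```

Since $\rho = 0$ contradicts the normalization of a probability distribution, the ideal of a point measure cannot be closed under the Poisson bracket; this is the sense in which sharp values of both $q$ and $p$ violate the classical compatibility conditions.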
In particular, any probability distribution whose support is a connected closed lagrangian submanifold is a quasiclassical pure state. The probability distribution $\rho[D, S]$ (76) is the most typical example because of its obvious relationship with the wave functions $\Psi$ through their phases $S$ and their moduli $D$. However, it is interesting that $\mathrm{Supp}\,\rho$ need not be a lagrangian submanifold to ensure the maximality of the probabilistic ideal $\mathfrak{E}_\rho$. In fact, when $\rho$ is an arbitrary distribution whose support is a primitive submanifold (cf. the definition above), then $\mathfrak{E}_\rho$ is maximal provided $\mathrm{Supp}\,\rho$ is analytic and connected. Although such a distribution is pure in the sense of answering the maximal number of compatible questions (measurements), it is hard to relate it to any wave function. The problem of the quasiclassical interpretation of such distributions from the point of view of the correspondence principle is still open. As a typical example we refer to the microcanonical ensembles of ergodic Hamiltonians. Hence one can hope that such distributions are in some sense related to quantum ergodic theory (Ludwig, 1961).

Remark: Let $M$ be an arbitrary submanifold of $P$, not necessarily a self-consistent one. There exist self-consistent ideals in $C^\omega(P)$ (associative ideals being at the same time Lie algebras) all elements of which vanish on $M$. However, they are of a non-probabilistic type; any such ideal is a non-trivial proper subspace of $V(M)$. A typical example is

$$\{f \in C^\omega(P) : f(p) = 0,\ df_p = 0,\ p \in M\} \qquad (146)$$

When $\rho$ is a classical probability distribution and $\mathrm{Supp}\,\rho$ is a primitive analytic submanifold, then obviously $\mathfrak{E}_\rho = V(\mathrm{Supp}\,\rho)$ is a maximal self-consistent probabilistic ideal and $\rho$ is a conceptual counterpart of a pure quantum state. However, if we omitted the word 'probabilistic' in the above statement it would become false, because there exist essentially larger self-consistent ideals of a non-probabilistic type.
In fact, let peP and let U p <= TpP be some n -dimensional isotropic subspace of the tangent space at p, i.e. n -dimensional linear space of pairwise y-orthogonal vectors attached at p e P: (y p , w®v) = y P a b w a v b = provided w, v e U p . Now let us put S(U P ) = \feC°(P) :f(p) = 0, d/ p e U p ) (147) (148) One can easily show that S(U P ) is a self-consistent ideal in C° (P) moreover it is a maximal self-consistent ideal. Let M<=P be an arbitrary lagrangian sub- manifold and let peM. Then obviously V{M)^S{TpM) (149) and this is a non-trivial proper inclusion. Hence the point measures in P are related to maximal self -consistent ideals of a non-probabilistic type in C° (P) ; Let S p be a Dirac measure concentrated at peM. There exists a system of In generators of S(U p ) namely F x . . . F 2n such that: {F h 8 p } = (150) Nevertheless, £(5 P ) is essentially larger than S(U P ) and this is why the point measures violate the relationship of information and symmetry suggested by the correspondence principle, although the 2n invariance conditions = {F h Sp] are satisfied. Let us finish our chapter with some remarks concerning the quasiclassical description of projectors in terms of symplectic geometry. We present only general ideas; more detailed information is given in (Stawianowski, 1971; Stawianowski, 1975). As we have mentioned above, there exists an exact relationship between lagrangian submanif olds in P and phases of quasiclassical wave functions. It is a well-known peculiarity of the quasiclassical WKB- approximation that all relations between phases become separated so as to satisfy some autonomous closed algebra quite independent of what happens to the moduli of the wave functions. Let D(P) denote the set of all closed lagrangian submanif olds in P. Now let M <= p be a simple closed submanifold of P (cf. the definition above) and let D{M) <= D(P) denote the set of all closed lagrangian submanif olds of P contained in M (i.e. 
JteD(M) if and only if M<^M). D(M) is non-empty because M is a I-class submanifold (compatibility conditions satisfied by quasiclassical wave functions). One can show that except for some special cases of singular intersections there exists for an arbitrary MeD(P) only one (AmJI) € D(M) such that C^nMKtAjwJO (151) 186 Uncertainty Principle and Foundations of Quantum Mechanics The natural mapping A M : D(P) -*D(M) satisfies the following rules: (1). It is a retraction of D(M): \ M \D(M) = id D( M) ( 152 ) in particlar, it is idempotent: A M °A M = A M (153) (2). When M, N, Mf)N are simple and closed, then: A m °Ajv = A N °A M = A M rw (I 54 ) (3). When A M °A N = A N °A M (155) then M ON is a simple submanifold and (154) is satisfied. (4). When / : P -* P is an arbitrary canonical mapping then: A /(M ) = -FoAm° j F'" 1 ( 156 ) Where F:D(P) -*D{P) is a mapping in D(P) induced in an obvious way by /. (By canonical mapping we mean here an analytic diffeomorphism of P, preserving y.) A M describes quasiclassical projection on to a subspace of quasiclassical wave functions characterized by definite sharp values of those physical quan- tities which are described by smooth functions, constant on M. Example: Let us consider an afline phase space with canonical affine coordi- nates (q\ p^, obviously: y = dp t a dq'. Let us put: (157) (158) Then M = {peP:pi(p) = b) D{P)BM={peP:q i (p) = a i , i = l...n} A M M = {peP:pi(p) = b,q 1 {p) = a 2 ,...,q n (p) = a n } (159) Hence fixation of the value of p x by means of A M results in a complete indeterminacy of q l (quasiclassical uncertainty principle). Quasiclassical theory becomes complete when besides the projectors A M , classical Huyghens-Fresnel superpositions are taken into account (cf. the definition at the end of the previous section). The actual definition and properties of A M are based on the geometric a priori of a symplectic manifold endowed with the second order Pfaff form y with a local description dp, a dq'. 
Similarly, the geometric structure of the Huyghens-Fresnel superposition can be deduced from the geometric a priori of the so-called contact manifolds, the geometry of which is based on the first-order Pfaff form $\Omega$ with a local representation $-dz + p_i\, dq^i$ (Sławianowski, 1971). The contact manifold is a fibre bundle over a symplectic manifold with a one-dimensional fibre. Roughly speaking, it arises from the classical phase space when the action variable (i.e. the phase of a quasiclassical wave function) is taken into account as an additional dimension (Souriau, 1970; Arnold, 1974).

REFERENCES

Abraham, R. (1967) Foundations of Mechanics, Benjamin, New York.
Arnold, V. I. (1971) Obyknovennyie Differentsialnyie Uravneniya (Ordinary Differential Equations, in Russian), Nauka, Moscow.
Arnold, V. I. (1974) Matematicheskie Metody Klassicheskoj Mekhaniki (Mathematical Methods of Classical Mechanics, in Russian), Nauka, Moscow.
Arnold, V. I. and Avez, A. (1968) Ergodic Problems of Classical Mechanics, Benjamin, New York.
Bargmann, V. (1954) 'On unitary ray representations of continuous groups', Ann. Math., 59, 1-46.
Bergmann, P. G. (1966) 'Hamilton-Jacobi and Schrödinger theory in theories with first-class Hamiltonian constraints', Phys. Rev., 144, 1078-1080.
Bergmann, P. G. (1970) Quantisation of the Gravitational Field, Aerospace Research Laboratories Report ARL 70-0066.
Bergmann, P. G. and Goldberg, I. (1955) 'Dirac bracket transformations in phase space', Phys. Rev., 98, 531-538.
Born, M. (1925) Vorlesungen über Atommechanik, Springer, Berlin.
Born, M., Heisenberg, W. and Jordan, P. (1926) 'Zur Quantenmechanik. II', Z. Phys., 35, 557-615.
Born, M. and Jordan, P. (1925) 'Zur Quantenmechanik', Z. Phys., 34, 858-888.
Born, M. and Wolf, E. (1964) Principles of Optics, Pergamon Press, London.
Caratheodory, C. (1956) Variationsrechnung und partielle Differentialgleichungen erster Ordnung, B. G. Teubner, Leipzig.
Dirac, P. A. M. (1950) 'Generalized Hamiltonian dynamics', Canad. J. Math., 2, 129.
Dirac, P. A. M. (1951) 'The Hamiltonian form of field dynamics', Canad. J. Math., 3, 1.
Dirac, P. A. M. (1958a) 'Generalized Hamiltonian dynamics', Proc. Roy. Soc. London, A 246, 326-332.
Dirac, P. A. M. (1958b) 'The theory of gravitation in Hamiltonian form', Proc. Roy. Soc. London, A 246, 333.
Dirac, P. A. M. (1964) 'Hamiltonian methods and quantum mechanics', Proc. Roy. Irish Acad. Sect. A, 63, 49-59.
Erdelyi, A. (1956) Asymptotic Expansions, Dover, New York.
Fröman, N. and Fröman, P. O. (1965) JWKB Approximation, North-Holland Publishing Co., Amsterdam.
Heisenberg, W. (1925) 'Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen', Z. Phys., 33, 879-893.
Heisenberg, W. (1927) 'Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik', Z. Phys., 43, 172-198.
Hermann, R. (1970) Vector Bundles in Mathematical Physics, Benjamin, New York.
Kostant, B. (1970) Lecture Notes in Mathematics, Springer, New York.
Landau, L. D. and Lifschitz, E. M. (1958) Quantum Mechanics, Pergamon Press, London.
Ludwig, G. (1961) 'Axiomatic quantum statistics of macroscopic systems (ergodic theory)', in Ergodic Theories, Proceedings of the International School of Physics 'Enrico Fermi', XIV Course, P. Caldirola, Ed., Academic Press, New York.
Mackey, G. W. (1963) The Mathematical Foundations of Quantum Mechanics, Benjamin, New York.
Messiah, A. (1965) Quantum Mechanics, North-Holland Publishing Co., Amsterdam.
Moyal, J. E. (1949) 'Quantum mechanics as a statistical theory', Proc. Cambridge Phil. Soc., 45, 99.
Schouten, J. A. (1954) Ricci Calculus, Berlin.
Schouten, J. A. and Kulk, W. (1949) Pfaff's Problem and its Generalizations, Clarendon Press, Oxford.
Schrödinger, E. (1926a) 'Quantisierung als Eigenwertproblem, I', Annln. Phys., 79, 361-376.
Schrödinger, E. (1926b) 'Quantisierung als Eigenwertproblem, II', Annln. Phys., 79, 489-527.
Schrödinger, E. (1926c) 'Über das Verhältnis der Heisenberg-Born-Jordanschen Quantenmechanik zu der meinen', Annln. Phys., 79, 734-756.
Schrödinger, E. (1926d) 'Quantisierung als Eigenwertproblem, III', Annln. Phys., 80, 437-490.
Schrödinger, E. (1926e) 'Quantisierung als Eigenwertproblem, IV', Annln. Phys., 81, 109-139.
Schwartz, L. (1950-1951) Théorie des distributions, Hermann, Paris.
Sławianowski, J. J. (1971) 'Quantum relations remaining valid on the classical level', Rep. Math.
Sławianowski, J. J. (1972) 'Geometry of Van Vleck ensembles', Rep. Math. Phys., 3, 157-172.
Sławianowski, J. J. (1973) 'Classical pure states. Information and symmetry in statistical mechanics', Int. J. Theoret. Phys., 8, 451-462.
Sławianowski, J. J. (1974) 'Abelian groups and the Weyl approach to kinematics', Rep. Math.
Sławianowski, J. J. (1975) Geometria Przestrzeni Fazowych (Geometry of Phase Spaces, in Polish), Polish Scientific Publishers, Warsaw.
Souriau, J. M. (1970) Structures des Systèmes Dynamiques, Dunod, Paris.
Sternberg, S. (1964) Lectures on Differential Geometry, Prentice Hall, New York.
Synge, J. L. (1953) 'Primitive quantization in the relativistic two-body problem', Phys. Rev., 89,
Synge, J. L. (1954) Geometrical Mechanics and de Broglie Waves, Cambridge University Press, Cambridge.
Synge, J. L. (1960) Classical Dynamics, Springer, Berlin.
Śniatycki, J. and Tulczyjew, W. M. (1971) 'Canonical dynamics of relativistic charged particles', Ann. Inst. Henri Poincaré, XV, 177-187.
Tulczyjew, W. M. (1968) Unpublished results.
Van Vleck, J. H. (1928) 'The correspondence principle in the statistical interpretation of quantum mechanics', Proc. Nat. Acad. Sci., 14, 178-188.
Weinstein, A. (1971) 'Symplectic manifolds and their lagrangian submanifolds', Advan. Math., 6,
Weinstein, A. (1973) 'Lagrangian submanifolds and Hamiltonian systems', Ann. Math., 98, 377-410.
Weyl, H. (1928) 'Quantenmechanik und Gruppentheorie', Z. Phys., 46, 1.
Weyl, H. (1931) The Theory of Groups and Quantum Mechanics, Dover, New York.
Wigner, E. (1932) 'On the quantum correction for thermodynamic equilibrium', Phys. Rev., 40, 749.

11

A Theoretical Description of Single Microsystems

GUENTHER LUDWIG
Universitaet Marburg, Germany

The fundamental relation for the interpretation of quantum mechanics is:

$$m = \mathrm{tr}(WE) \tag{1}$$

$E$ being a projection operator in a Hilbert space $\mathcal{H}$ and $W$ being a self-adjoint operator with $W \ge 0$ and $\mathrm{tr}(W) = 1$. As is well known, the trace in (1) can be calculated by

$$\mathrm{tr}(WE) = \sum_\nu (\varphi_\nu, WE\varphi_\nu)$$

$\varphi_\nu$ being any complete orthonormal set of vectors in $\mathcal{H}$; $(\ldots, \ldots)$ denotes the inner product in $\mathcal{H}$. The real number $m$ in (1) satisfies $0 \le m \le 1$. The fundamental statistical interpretation of quantum mechanics is as follows: $m$ is the probability of $E$.

Better known than the general form (1) is the special case $W = P_\varphi$, $P_\varphi$ being the projection operator which projects onto the one-dimensional subspace of $\mathcal{H}$ spanned by the vector $\varphi$ ($\|\varphi\| = 1$). Then (1) takes on the form

$$m = \mathrm{tr}(P_\varphi E) = (\varphi, E\varphi). \tag{2}$$

Any experimental test of quantum mechanics employs the relation (1).

A description of quantum mechanics based on the general formula (1) is given in (Ludwig, 1975) and in a more consistent way in Ludwig (in preparation a).

More general, but reducible to (1), is the well-known interpretation that

$$\mathrm{tr}(WA) \tag{3}$$

(and $\mathrm{tr}(P_\varphi A) = (\varphi, A\varphi)$ in the case of $W = P_\varphi$) is the expectation value of the observable $A$ (in expression (3), $A$ is a self-adjoint operator).
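Relations (1) to (3) can be checked numerically in a small example. The following is a minimal sketch, assuming a two-dimensional Hilbert space and NumPy; the particular vectors, the weights 0.25 and 0.75, and the helper names `projector` and `probability` are illustrative choices and do not come from the text.

```python
import numpy as np

def projector(phi):
    """Projection operator onto the one-dimensional subspace spanned by phi."""
    phi = np.asarray(phi, dtype=complex)
    phi = phi / np.linalg.norm(phi)      # enforce ||phi|| = 1
    return np.outer(phi, phi.conj())

def probability(W, E):
    """m = tr(W E), relation (1)."""
    return float(np.trace(W @ E).real)

# Illustrative vectors in a two-dimensional Hilbert space (an assumption).
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
plus = (up + down) / np.sqrt(2.0)

E = projector(up)                                        # projection operator E
W_pure = projector(plus)                                 # special case W = P_phi
W_mixed = 0.25 * projector(up) + 0.75 * projector(down)  # W >= 0, tr(W) = 1

m_pure = probability(W_pure, E)                  # tr(P_phi E)
m_inner = float(np.vdot(plus, E @ plus).real)    # (phi, E phi), relation (2)
m_mixed = probability(W_mixed, E)

# Expectation value tr(W A), expression (3), for a self-adjoint A.
A = np.array([[1.0, 0.0], [0.0, -1.0]])
expectation = float(np.trace(W_mixed @ A).real)

print(m_pure, m_inner, m_mixed, expectation)
```

In this sketch `m_pure` and `m_inner` agree, which is the content of relation (2), and every value of `probability` lies between 0 and 1 as required of $m$.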
The Heisenberg uncertainty relation is nothing other than the physical interpretation of the mathematical theorem that Heisenberg's commutation relation $PQ - QP = (\hbar/i)\mathbf{1}$, holding for the position and momentum operators $Q$ and $P$, yields

$$\Delta P \cdot \Delta Q \ge \frac{\hbar}{2} \tag{4}$$

with $\Delta P$ and $\Delta Q$ defined by

$$\Delta P^2 = \mathrm{tr}\big(W(P - \alpha\mathbf{1})^2\big), \quad \Delta Q^2 = \mathrm{tr}\big(W(Q - \beta\mathbf{1})^2\big), \quad \alpha = \mathrm{tr}(WP), \quad \beta = \mathrm{tr}(WQ).$$

The physical interpretation of expression (4) depends on the physical interpretation of (3).

In order to reduce expression (3) to (1), it is necessary to introduce the conception of 'simultaneous measurability', i.e. of 'commensurability'. Usually it is assumed that commensurable $E_\lambda$'s ($E_\lambda$ being a projection operator in $\mathcal{H}$) commute, i.e. $E_\lambda E_\rho = E_\rho E_\lambda$. By means of a family of commuting projection operators and a measuring scale it is possible to deduce (3) from (1). We will give such a deduction in Section 5.

We may summarize: for the interpretation of quantum mechanics, we need the following concepts (the physical meaning of which must be exactly specified):

(1). Probability, denoted in expression (1) by the real number $m$.
(2). A physical interpretation of the operator $W$ in (1) (or the vector $\varphi$ in (2)). The following expressions are often used: $W$ is the mathematical image of an 'ensemble' or of a 'state'. Sometimes the expression 'state' is used only in the case $W = P_\varphi$, calling $\varphi$ the 'state of the system'. $W$ is also called the 'statistical operator' (or the 'statistical matrix', if given in matrix form).
(3). $E$ in (1) is considered as the image of a 'yes-no observable', a 'yes-no measurement'. Words like 'questions' or 'propositions' are also used.
(4). A conception of 'commensurability'. Sometimes, instead of 'commensurability', one speaks of 'measurability at the same time'.

All discussions concerning quantum mechanics ultimately depend on the concepts mentioned above under points (1) through (4). Many misunderstandings are based on the fact that various authors use the same words for different conceptions. All mistakes, all paradoxes are based on inadmissible fictions attached to the expressions noted under (1) through (4).

It is impossible to give here a survey of all the discussions concerned with the various concepts. Such discussions do not appear clear enough since, up to now, one has tried to clarify the meaning of the concepts under (1) through (4) using common language. This was necessary, because the description of single experiments and their results had to be given in common language. The gap between relation (1) and experiments is much too big to correlate experiments and theory immediately. One has to use common language to bridge this gap.

This necessity may be seen if we want to describe an experiment on one individual microsystem, e.g. an individual trace in a cloud chamber. There is no term in the mathematical framework of quantum theory which could be used as an image of this individual experiment; neither the 'probability' $m$, nor the 'statistical operator' $W$, nor the 'observable' $A$. Thus it seems only natural to use common language in describing single experiments. In the mathematical framework of quantum mechanics, we have the 'set of statistical operators' and the 'set of projection operators', but there is no set the elements of which could be used as 'images' of individual microsystems. The concept of an 'individual microsystem' itself is not cleared up theoretically by quantum mechanics; it may be understood only intuitively.

Someone might object that quantum theory contains terms usable as images for individual systems in the same way as classical mechanics does: in classical mechanics every individual system is described by a point in the $\Gamma$-space, the 'state space' of classical mechanics; in quantum theory every individual system is described by a 'state' $\varphi$, $\varphi$ being an element of the Hilbert space $\mathcal{H}$.
The surface of the unit sphere in $\mathcal{H}$ has to be taken as the 'state space' of quantum mechanics. In my opinion, this is a false interpretation of quantum mechanics; in any case, in quantum theory, it is generally impossible to determine the state $\varphi$ of an individual system in a single experiment.

All the problems sketched above motivate the development of a more comprehensible mathematical framework for quantum theory, one which makes it possible to interpret this framework before using relation (1) and the concepts given above under points (2) through (4). The experiments on individual microsystems and the statistics of these experiments should be described directly by their respective 'images' in this new mathematical framework. Then the physical interpretation of quantum mechanics does not depend on the 'statistical operators' $W$ and the 'observable operator' $A$; the physical interpretation will depend only on two fundamental notions, just as in all classical theories, namely on the notions of 'physical system' and of 'statistics'.

The fundamental concepts of quantum mechanics, in this new form, will not be essentially different from those of any classical theory. The interpretation may be done in exactly the same way as in classical theories. The concepts mentioned above under points (2) through (4) will not be fundamental, but will be derived from the fundamental concepts of physical system and of statistics. Their interpretation will be given automatically by their deduction and by the interpretation of the two fundamental concepts. If only these two concepts are accepted as fundamental, all paradoxes and misunderstandings disappear; only mistakes will be possible, but mistakes can be corrected.

1. THE CONCEPTION OF 'PHYSICAL SYSTEM'

The conception of physical system is based on the possibility of making 'experiments' on such systems. The first step in such experiments is to 'produce', to 'manufacture', to 'prepare' the systems. Examples of microsystems: an accelerator produces 'ions'; another accelerator produces 'electrons'; a third produces 'pairs of particles', using colliding beams. The last example shows that an individual system is not necessarily an elementary system (in the sense of not being composed). An individual system is only one of a large family produced by a preparing apparatus.

After their production, the various systems can be 'recorded' experimentally. This recording may also be described by saying that 'something has been measured on the system'. Such recording of a system may be realized, e.g. by a trace in a cloud chamber, by a signal of a counter, etc.

We will give examples for preparing and recording of microsystems: atoms can be produced (prepared) by a canal-ray tube and the photons emitted by these atoms may be recorded; electron-proton pairs can be prepared, and after the collision between electrons and protons the electrons may be recorded; nuclei can be prepared and (in the case of $\beta$-decay) the electrons emitted by the nuclei can be recorded.

To show that the usual experimental procedures concerning classical systems have the same structure, we will give just one example: a gun may 'prepare' projectiles and the trajectory of the projectiles can be recorded; e.g. the point of impact as a part of the trajectory may be recorded.

The examples show that the fundamental parts of 'experiments on physical systems' are the processes of preparing and recording. It is impossible to give here a more detailed description of the preparing and recording procedures composing the experiments on systems. However, we want to stress that preparing and recording procedures may be described without referring to the physical systems that are prepared and recorded. A gun, for instance, may be described as a procedure (i.e.
its construction and instructions for use) without referring to a special projectile fired by this gun; the impact, too, may be described without explaining the cause of this event. Likewise, in the case of experiments on microsystems, the experimenter can describe the apparatus used to prepare the systems and the apparatus recording them. The 'evaluation' of such experiments, giving values of quantities defined by a theory of the system, follows after this description; cross sections, wavelengths, etc., are calculated only after the description of the experimental procedures and the recording of the response of the measuring apparatus has been given (e.g. by a computer).

To give a mathematical picture of such experiments on individual systems, it is necessary to introduce a set $M$, the elements of which shall be 'images' of the systems. Given a special atom '$x$' in an experiment, the relation $x \in M$ should be the mathematical form of the proposition: $x$ is a physical system. However, the relation $x \in M$ reflects the proposition '$x$ is a physical system' only if the set $M$ is endowed with a structure as an image of preparing and recording procedures. (A systematic description of this method of theoretical physics, employed here in a more intuitive way, is given in Ludwig (in preparation b).)

We will not discuss the question concerning the reasons which allow us to speak of given microsystems and, in this sense, of real microsystems of which the elements of $M$ are images. The reader interested in such questions will find a detailed discussion in Ludwig (1970, 1972a) and in a very short form in (Ludwig, 1974a). In these references, a theoretical description is given of how to 'recover' the microsystems, starting from the description of preparing and recording procedures only. However, to understand quantum mechanics it is not necessary to justify the existence of real microsystems, an existence which was founded more or less intuitively in the course of history.

(A) SELECTION PROCEDURES

According to the short sketches given above, the preparing and recording devices have a common structure, namely, that by these procedures physical systems can be selected. Therefore, it seems useful to begin with a mathematical description of this common structure.

Physical as well as mathematical reasons motivate the introduction of a more general species of structure, henceforth called a 'selection procedure':

A subset $\mathcal{S} \subset \mathcal{P}(M)$ ($\mathcal{P}(M)$ being the power set, i.e. the set of all subsets of $M$) is called a structure of selection procedures, or shortly a selection structure on $M$, if the following axiom holds ($b \setminus a$ being the relative complement of $a$ in $b$):

AS 1.1 $a, b \in \mathcal{S}$ and $a \subset b \Rightarrow b \setminus a \in \mathcal{S}$
AS 1.2 $a, b \in \mathcal{S} \Rightarrow a \cap b \in \mathcal{S}$

A physicist would like to 'understand' why we have postulated AS 1.1, 2. However, it is more or less difficult to 'make plausible' this rather general conception of selection procedures. Therefore we can only give some hints:

If $M \in \mathcal{S}$, AS 1.1, 2 has the consequence that $\mathcal{S}$ would be a Boolean algebra of sets. We have not postulated $M \in \mathcal{S}$; nevertheless, the assumption $M \in \mathcal{S}$ would not lead to mathematical contradictions in the following. However, it seems to us that the postulate $M \in \mathcal{S}$ would be unrealistic on physical grounds. To see this, we will try to elucidate the physical significance of AS 1.1, 2.

The physical interpretation of '$x \in a$ and $a \in \mathcal{S}$' is as follows: the physical system $x$ has been selected by the selection procedure $a$. In this sense, an element $a \in \mathcal{S}$ represents the method of selecting as well as the family of physical systems selected by this method.

If there are two selection procedures $a$ and $b$ given by special physical methods, it is not difficult to construct the following selection procedure: select all $x \in M$ which are selected by $a$ as well as by $b$, i.e. the set $a \cap b$. This is the meaning of AS 1.2. Of course, $a \cap b = \emptyset$ is possible, namely, in the case when there are no systems which can be selected by $a$ as well as by $b$, i.e.
if the selection procedures $a$ and $b$ are incompatible.

If $a \subset b$, the selection procedure $a$ is called finer than $b$. If $a$ is finer than $b$, and if one selects by the method $a$ all $x$ of $b$, the remaining systems of $b$ are those of $b \setminus a$; AS 1.1 says that the selection of these remaining systems is also a selection procedure.

The following two examples show the unrealistic feature of the postulate $M \in \mathcal{S}$.

Consider an apparatus producing steel balls. This apparatus is an example of a selection procedure for steel balls. Let us denote this selection procedure by $a$. Then $a \subset M$ and $a \in \mathcal{S}$ hold, $M$ being the 'set of all steel balls'. The set $M \setminus a$ is then characterized as the set of all those elements which have not been selected at all. The knowledge of the construction of the machine makes it possible to specify various properties of the systems of $a$. However, there are no properties which can be ascribed to the systems of $M \setminus a$. Therefore, we have not included $M \setminus a$ in the set of selection procedures.

We have a similar situation in the case of an accelerator of electrons. The knowledge of the construction of the accelerator yields very essential information on the electrons produced by this apparatus. Such information is necessary for any physicist who wants to perform experiments on these electrons. However, what can be said about all those electrons which are not produced by this particular accelerator?

In the following, let $\mathcal{S}(a)$ be the abbreviation for the set:

$$\mathcal{S}(a) = \{ b \mid b \in \mathcal{S} \text{ and } b \subset a \}$$

As a consequence of AS 1.1, 2, $\mathcal{S}(a)$ is a Boolean algebra of sets, $a$ being the unit element of $\mathcal{S}(a)$.

To every set $\Sigma \subset \mathcal{P}(M)$ there exists a smallest set $\mathcal{S}$ of selection procedures with $\mathcal{S} \supset \Sigma$. $\mathcal{S}$ is called the set of selection procedures generated by $\Sigma$.

(B) STATISTICAL SELECTION PROCEDURES

In this section we want to give a mathematical formulation of the second fundamental concept of quantum mechanics, i.e.
the concept of statistics, of probability. Many authors consider the familiar mathematical probability theory as sufficient for the foundation of quantum mechanics. Other authors state that the quantum mechanical probability given by (1) cannot be formulated in the framework of the familiar mathematical probability theory. This last opinion will prove wrong, as we will see in the following.

There are two reasons why we will now recall some essential aspects of the familiar mathematical probability theory:

Reason 1: The axioms of mathematical statistics are so simple that we are able to list them in order to give a complete survey of all fundamental concepts of quantum mechanics.
Reason 2: We will formulate the axioms in a more 'physical' form, i.e. in a form more suitable for describing experiments with physical systems.

In the results of experiments, probability appears in the form of frequencies with which a selection procedure $b$ finer than $a$ selects systems of $a$. In other words: if, in an experiment, $N$ systems $x_1, \ldots, x_N \in M$ are selected by the procedure $a$ and if one selects out of these $N$ systems those $N'$ systems which also fulfil the conditions of $b$, the number $N'/N$ is called the frequency with which $b$ has selected systems of $a$. In many (not in all!) cases the experiments show a 'reproducibility' of this frequency, i.e. if one repeats experiments employing selection procedures $a$ and $b$, one obtains nearly the same frequencies if $N$ and $N'$ are 'large' numbers. If such a reproducible frequency exists, we say that $b$ depends statistically on $a$. To give a mathematical description of such a statistical dependence we introduce a mathematical structure called 'statistical selection procedures' or shortly a 'statistical selection structure', which is defined as follows:

A set $\mathcal{S} \subset \mathcal{P}(M)$ is called a structure of statistical selection procedures if AS 1 holds and if a mapping $\lambda$ is given, mapping

$$\mathcal{T} = \{ (a, b) \mid a, b \in \mathcal{S};\ a \supset b \text{ and } a \ne \emptyset \}$$

into the interval $[0, 1]$ of real numbers, and if the following axioms hold:

AS 2.1 $a_1, a_2 \in \mathcal{S}$, $a_1 \cap a_2 = \emptyset$, $a_1 \cup a_2 \in \mathcal{S} \Rightarrow \lambda(a_1 \cup a_2, a_1) + \lambda(a_1 \cup a_2, a_2) = 1$
AS 2.2 $a_1, a_2, a_3 \in \mathcal{S}$, $a_1 \supset a_2 \supset a_3$, $a_2 \ne \emptyset \Rightarrow \lambda(a_1, a_3) = \lambda(a_1, a_2)\,\lambda(a_2, a_3)$
AS 2.3 $a_1, a_2 \in \mathcal{S}$, $a_1 \supset a_2$, $a_2 \ne \emptyset \Rightarrow \lambda(a_1, a_2) \ne 0$

$\lambda(a, b)$ is usually called the probability of $b$ relative to $a$. $\lambda(a, b)$ is the mathematical picture of the frequency with which $b$ selects relative to $a$, as described above. With this 'interpretation' of $\lambda(a, b)$ at hand, the reader may easily check the 'physical' significance of axioms AS 2.1 to 3 (see also Ludwig (1975) and in preparation a).

From AS 2.1 to 3 we obtain: $\lambda(a, a) = 1$, $\lambda(a, \emptyset) = 0$; and for $a_1 \supset a_2$, $a_1 \supset a_3$ and $a_2 \cap a_3 = \emptyset$:

$$\lambda(a_1, a_2 \cup a_3) = \lambda(a_1, a_2) + \lambda(a_1, a_3)$$

By $\mu(b) = \lambda(a, b)$, an additive measure on the Boolean algebra $\mathcal{S}(a)$ is defined.

In the sequel, the following definition will be important:

Definition: A decomposition $a = \bigcup_{i=1}^{n} b_i$ of an $a \in \mathcal{S}$ with $b_i \ne \emptyset$, $b_i \in \mathcal{S}$, $b_i \cap b_k = \emptyset$ if $i \ne k$ is called a 'demixture' of $a$ into the $b_i$'s, and $a$ is called a 'mixture' of the $b_i$'s. $\lambda(a, b_i)$ is called the 'weight' of $b_i$ in $a$.

From AS 2.1 through 3 we obtain

$$\sum_i \lambda(a, b_i) = 1$$

(C) PREPARING PROCEDURES

As we have mentioned in the beginning, we want to give a mathematical picture of the procedures by which physical systems are prepared.
To this end, we introduce a structure on $M$ ($M$ being the set of systems) by a set $\mathcal{Q} \subset \mathcal{P}(M)$ (the elements of $\mathcal{Q}$ shall be the pictures of the various preparing procedures), for which the following axiom holds:

APS 1 $\mathcal{Q}$ is a statistical selection structure.

The probability function defined by APS 1 will be denoted by $\lambda_{\mathcal{Q}}$. '$x \in a$' is the mathematical form of the proposition: the physical system $x$ has been prepared by the procedure $a$.

(D) RECORDING PROCEDURES

It is a bit more complicated to give a mathematical picture of the procedures by which physical systems are recorded. The recording process is characterized by two steps:

(1). Construction and employment of the recording apparatus.
(2). Selection according to signals which appeared (or did not appear) on the recording apparatus employed.

Accordingly, we define another mathematical structure on $M$ by choosing two other subsets of $\mathcal{P}(M)$: $\mathcal{R}_0$ and $\mathcal{R}$. $\mathcal{R}_0$ and $\mathcal{R}$ satisfy the following axioms:

APS 2 $\mathcal{R}$ is a selection structure.
APS 3 $\mathcal{R}_0$ is a statistical selection structure.
APS 4.1 $\mathcal{R}_0 \subset \mathcal{R}$.
APS 4.2 $b \in \mathcal{R}$, $b \supset b_0 \in \mathcal{R}_0$ and $b_0 \ne \emptyset \Rightarrow b \in \mathcal{R}_0$.
APS 4.3 To each $b \in \mathcal{R}$ there is a $b_0 \in \mathcal{R}_0$ with $b_0 \supset b$.

In order to describe the physical meaning of APS 2 through 4, we must say of what the elements of $\mathcal{R}_0$ and $\mathcal{R}$ are pictures.

An element $b_0 \in \mathcal{R}_0$ represents the construction and employment of a recording apparatus. We may clarify this by an example: the constructed apparatus may be a Geiger counter; then $b_0$ is the set of all those microsystems to which this Geiger counter is applied. $x \in b_0$ is the mathematical form of the proposition: the Geiger counter $b_0$ has been used to record $x$. This does not imply that a recording signal has been produced by $x$. Therefore, we call $\mathcal{R}_0$ the set of recording methods.

The Geiger counter (mentioned above as an example), used to record $x$, can respond or not; $b_+$ may be the selection procedure for all those systems $x \in b_0$ to which the Geiger counter has responded; hence $b_+ \subset b_0$.
Correspondingly, $b_-$ may be the set of all those $x \in b_0$ to which the counter has not responded; hence $b_- = b_0 \setminus b_+$. $b_+$ and $b_-$ are elements of $\mathcal{R}$. Generally, $\mathcal{R}$ is the set of all those selection procedures which are finer than the procedures of $\mathcal{R}_0$; finer by virtue of the influence of the microsystems on the apparatus represented by the elements of $\mathcal{R}_0$. We express this briefly by saying: $\mathcal{R}$ is the set of all recording procedures.

Concerning the axioms APS 2 through 4, we will make only short remarks; a more general discussion is given in Ludwig (1975 and in preparation a). APS 3 means that the statistical dependence between the various recording methods has nothing to do with the microsystems. In contradistinction to the elements of $\mathcal{R}_0$, the selection procedures $b \in \mathcal{R}$ depend essentially on the influence of the microsystems. For this reason we did not state in APS 2 that $\mathcal{R}$ should be a statistical selection structure. We may illustrate this situation with the example of a counter: in nature there are no reproducible frequencies $\lambda(b_0, b_+)$ for the response of the counter as such; $\lambda(b_0, b_+)$ would describe frequencies independent of the surroundings of the counter. In reality, the frequency of the response of the counter depends essentially on its surroundings.

The probability function corresponding to $\mathcal{R}_0$ will be denoted by $\lambda_{\mathcal{R}_0}$.

(E) THE DEPENDENCE OF THE RECORDING ON THE PREPARING PROCESS

The first physical problem is raised by the question: which preparing procedures and measuring procedures may be combined? Unfortunately, this problem is not trivial. We define quite naturally: $a \in \mathcal{Q}$ and $b_0 \in \mathcal{R}_0$ are said to be combinable if $a \cap b_0 \ne \emptyset$. The combination problem amounts to finding axioms (as laws of nature in mathematical form) concerning the set

$$\mathcal{C} = \{ (a, b_0) \mid a \in \mathcal{Q},\ b_0 \in \mathcal{R}_0,\ a \cap b_0 \ne \emptyset \}$$

A discussion of this combination problem would be beyond the scope of this paper (see Ludwig, in preparation a).
Here it seems sufficient to give a very simple axiom, though this axiom is, in fact, not very realistic. (This simple axiom can be replaced by another, more realistic one with essentially the same mathematical consequences.)

If

$$\mathcal{Q}' = \{ a \mid a \in \mathcal{Q},\ a \ne \emptyset \}, \qquad \mathcal{R}_0' = \{ b_0 \mid b_0 \in \mathcal{R}_0,\ b_0 \ne \emptyset \}$$

we formulate as an axiom:

APS 5 $\mathcal{C} = \mathcal{Q}' \times \mathcal{R}_0'$

The central problem of quantum mechanics is the description of the statistical dependence of the recording on the preparing process. To begin with, we define

$$\Theta = \{ c \mid c = a \cap b \text{ and } a \in \mathcal{Q},\ b \in \mathcal{R} \}$$

An element $c = a \cap b$ of $\Theta$ is the set of all systems $x$ prepared by the procedure $a$ and recorded by the procedure $b$. Let $\mathcal{S}$ be the smallest set of selection procedures for which $\Theta \subset \mathcal{S}$. In general, neither $\mathcal{Q} \subset \mathcal{S}$ nor $\mathcal{R} \subset \mathcal{S}$ holds!

Now we formulate the experience that the combination of preparing and recording procedures leads to reproducible frequencies:

APS 6 $\mathcal{S}$ is a statistical selection structure.

The probability function corresponding to $\mathcal{S}$ will be denoted by $\lambda_{\mathcal{S}}$.

The three probability functions $\lambda_{\mathcal{Q}}$, $\lambda_{\mathcal{R}_0}$, $\lambda_{\mathcal{S}}$ cannot be independent for physical reasons. Physical experience suggests that there is no dependence of the recording methods $b_0 \in \mathcal{R}_0$ on the procedures $a \in \mathcal{Q}$. This fact is expressed by the following axiom:

APS 7 $a_1, a_2 \in \mathcal{Q}$; $a_2 \subset a_1$; $b_{01}, b_{02} \in \mathcal{R}_0$; $b_{02} \subset b_{01}$ and $a_1 \cap b_{01} \ne \emptyset$ implies:
7.1 $\lambda_{\mathcal{S}}(a_1 \cap b_{01}, a_2 \cap b_{01}) = \lambda_{\mathcal{Q}}(a_1, a_2)$;
7.2 $\lambda_{\mathcal{S}}(a_1 \cap b_{01}, a_1 \cap b_{02}) = \lambda_{\mathcal{R}_0}(b_{01}, b_{02})$.

Axiom APS 7 implies the important theorem (for proof see Ludwig, in preparation a):

The function $\lambda_{\mathcal{S}}$ is determined uniquely by $\lambda_{\mathcal{Q}}$ and the special values $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ for $a \in \mathcal{Q}'$, $b_0 \in \mathcal{R}_0'$, $b \in \mathcal{R}$ and $b \subset b_0$.

If one looks at the various experiments, it is easy to see that only the values $\lambda_{\mathcal{S}}(a \cap b_0, a \cap b)$ are tested by experimental physicists. Only one example will be sketched: by a preparing procedure $a$, a pair of particles may be produced to perform a collision experiment.
The particles are recorded after the collision by a recording method $b_0$. Let $b$ (with $b \subset b_0$) be the recording procedure that counts the cases in which $b_0$ has responded with a certain signal. $a \cap b_0$ are all those systems (i.e. pairs of particles!) which are prepared by $a$ and for which the recording method $b_0$ is employed; $a \cap b_0$ characterizes the collision experiment. $a \cap b$ are all those systems to which the recording apparatus has responded.

We define

$$\mathcal{F} = \{ (b_0, b) \mid b_0 \in \mathcal{R}_0',\ b \in \mathcal{R},\ b \subset b_0 \}$$

and call $\mathcal{F}$ the set of all effect procedures. The following real function is defined on $\mathcal{Q}' \times \mathcal{F}$:

$$\mu(a, f) = \mu(a, (b_0, b)) = \lambda_{\mathcal{S}}(a \cap b_0, a \cap b) \tag{5}$$

The function $\mu(a, f)$ defined by (5) plays a central role in the statistical description of physical systems, especially of microsystems. The axioms imply the following theorem (for proof see Ludwig, in preparation a).

The function $\mu(a, f)$ satisfies the following relations:

(1). $0 \le \mu(a, f) \le 1$.
(2). To every $a \in \mathcal{Q}'$, there is an $f_0 \in \mathcal{F}$ for which $\mu(a, f_0) = 0$.
(3). To every $a \in \mathcal{Q}'$, there is an $f_1 \in \mathcal{F}$ for which $\mu(a, f_1) = 1$.
(4). Every demixture $a = \bigcup_i a_i$ (see the end of Section 1(B)) implies
$$\mu\Big(\bigcup_i a_i, f\Big) = \sum_i \lambda_i\, \mu(a_i, f) \quad \text{and} \quad 0 < \lambda_i = \lambda_{\mathcal{Q}}(a, a_i) \le 1, \quad \sum_i \lambda_i = 1.$$
(5). $b_{01} \supset b_{02} \supset b$ ($b_{01}, b_{02} \in \mathcal{R}_0'$) and $f_1 = (b_{01}, b)$, $f_2 = (b_{02}, b)$ implies
$$\mu(a, f_1) = \lambda_{\mathcal{R}_0}(b_{01}, b_{02})\, \mu(a, f_2) \quad \text{for all } a \in \mathcal{Q}'.$$
(6). Every demixture $b_0 = \bigcup_{i=1}^{n} b_i$ (i.e. $b_i \in \mathcal{R}$, $b_i \cap b_k = \emptyset$ if $i \ne k$) implies (with $f_i = (b_0, b_i)$)
$$\sum_{i=1}^{n} \mu(a, f_i) = 1 \quad \text{for all } a \in \mathcal{Q}'.$$
(7). $\mu(a, (b_0, b)) = 0$ is equivalent to $a \cap b = \emptyset$.

According to the following theorem (proof by H. Neumann, to be published in Ludwig, in preparation a), the statistical structure of the theory is completely described by the function $\mu$.

Given $\lambda_{\mathcal{Q}}$, the conditions (1), (4), (5) and (7) for the function $\mu(a, f)$ imply the existence of a uniquely defined probability function $\lambda_{\mathcal{S}}$ with

$$\lambda_{\mathcal{S}}(a \cap b_0, a \cap b) = \mu(a, (b_0, b)).$$

$\lambda_{\mathcal{R}_0}$ is determined by $\mu(a, f) = \mu(a, (b_0, b))$.
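Definition (5) and relations (1), (2), (3), (4) and (6) can be illustrated by counting in a finite toy experiment. The sketch below is only an illustration under stated assumptions: the two preparations `a1` and `a2`, the single counter, and all the counts are invented, and relative frequencies stand in for the probability functions. The counts are chosen so that the same fraction of each preparation is sent to the counter, which is what APS 7.1 demands.

```python
from fractions import Fraction

def make_systems():
    """Build hypothetical systems as tuples (id, preparation, method, clicked)."""
    # preparation -> (number prepared, number sent to the counter, number of clicks)
    spec = {"a1": (40, 20, 5), "a2": (60, 30, 24)}
    systems = []
    for prep, (n, n_counter, n_click) in spec.items():
        for i in range(n):
            method = "counter" if i < n_counter else "none"
            clicked = i < n_click          # clicks occur only among counted systems
            systems.append((len(systems), prep, method, clicked))
    return frozenset(systems)

M = make_systems()
a1 = frozenset(x for x in M if x[1] == "a1")      # preparing procedures
a2 = frozenset(x for x in M if x[1] == "a2")
a = a1 | a2                                       # mixture of a1 and a2
b0 = frozenset(x for x in M if x[2] == "counter") # recording method
b_plus = frozenset(x for x in b0 if x[3])         # counter has responded
b_minus = b0 - b_plus                             # counter has not responded

def lam(outer, inner):
    """Relative frequency of inner within outer (toy probability function)."""
    return Fraction(len(inner), len(outer))

def mu(prep, effect_procedure):
    """mu(a, (b0, b)) = lam(a & b0, a & b), definition (5)."""
    method, response = effect_procedure
    return lam(prep & method, prep & response)

f = (b0, b_plus)
weights = [lam(a, a1), lam(a, a2)]                # weights of the demixture of a
mixture_rule = weights[0] * mu(a1, f) + weights[1] * mu(a2, f)

print(mu(a1, f), mu(a2, f), mu(a, f), mixture_rule)
print(mu(a, (b0, b_plus)) + mu(a, (b0, b_minus)))  # relation (6)
```

Exact fractions are used so that relation (4) can be compared by equality rather than within a tolerance. If the fractions sent to the counter differed between `a1` and `a2`, the mixture rule would fail in this counting model, which shows why an axiom such as APS 7 is needed.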
By the formulation of the axioms APS 1 through APS 7, we have reached our first aim, namely the definition of the concept of physical systems: those components of experiments which are represented by elements of a set $M$ (according to the mapping principles of the physical theory) are called physical systems if the set $M$ is endowed with a structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ such that the axioms APS 1 through APS 7 hold, and if the elements of $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ are pictures of preparing and recording procedures (according to the mapping principles of the theory). All probabilities concerning the outcome of experiments on a physical system are determined by the function $\mu$. However, this structure is not yet typical for microsystems, as may easily be seen by looking at the example of the gun (as an $a \in \mathcal{Q}'$) and the impacts of the projectiles (as a $b \in \mathcal{R}$).

2. ENSEMBLES AND EFFECTS

The next essential step in the development of the theory will be the introduction of the notions of ensembles and effects. We will introduce these notions on the basis of preparing and recording procedures. It is to be stressed that these notions do not agree in every respect with the customary intuitive usage of the words ensembles and effects.

In the discussion of problems in the interpretation of quantum mechanics, many difficulties arise because, in using 'common language' for a description of experiments, usually no difference is made between preparing procedures and ensembles (or 'states'): a family of microsystems, prepared by a procedure $a \in \mathcal{Q}'$, is often called an ensemble or a set of systems in a 'state', where the ensemble or state is described by a statistical operator $W$. It is impossible to give here a survey of all misunderstandings and mistakes caused by not distinguishing between preparing procedures and ensembles.
Only in the case of the so-called Einstein-Podolsky-Rosen paradox will we demonstrate (see Section 8) how the situation is clarified by the use of the concepts introduced here. The notion of effects (also known as yes-no measurements or questions) is likewise used in different senses by various authors. It may be stressed here from the outset that in the Hilbert space representation of the theory the effects are not always represented by projection operators!

Since the function $\mu$ is defined by (5) on the whole of the set $\mathcal{Q}' \times \mathcal{F}$, the relation defined by
$$\mu(a_1, f) = \mu(a_2, f) \quad \text{for all } f \in \mathcal{F}$$
is an equivalence relation on $\mathcal{Q}'$: $a_1 \sim a_2$. The notion of ensembles is defined by:

Definition 1: Let $\mathcal{K}$ be the set of all equivalence classes in $\mathcal{Q}'$. An element of $\mathcal{K}$ is called an ensemble (or a state); $\mathcal{K}$ is called the set of ensembles (or states).

The relation
$$\mu(a, f_1) = \mu(a, f_2) \quad \text{for all } a \in \mathcal{Q}'$$
defines an equivalence relation $f_1 \sim f_2$ on $\mathcal{F}$.

Definition 2: Let $\mathcal{L}$ be the set of all equivalence classes in $\mathcal{F}$. An element of $\mathcal{L}$ is called an effect; $\mathcal{L}$ is called the set of effects.

The following theorem holds: For $w \in \mathcal{K}$, $g \in \mathcal{L}$ and $a \in w$, $f \in g$ the equation
$$\tilde{\mu}(w, g) = \mu(a, f)$$
defines a real function $\tilde{\mu}$ on $\mathcal{K} \times \mathcal{L}$. $\tilde{\mu}$ satisfies:

(1). $0 \le \tilde{\mu}(w, g) \le 1$.

(2). $\tilde{\mu}(w_1, g) = \tilde{\mu}(w_2, g)$ for all $g \in \mathcal{L}$ $\Rightarrow$ $w_1 = w_2$.

(3). $\tilde{\mu}(w, g_1) = \tilde{\mu}(w, g_2)$ for all $w \in \mathcal{K}$ $\Rightarrow$ $g_1 = g_2$.

(4). There is a $g_0 \in \mathcal{L}$ such that $\tilde{\mu}(w, g_0) = 0$ for all $w \in \mathcal{K}$.

(5). There is a $g_1 \in \mathcal{L}$ such that $\tilde{\mu}(w, g_1) = 1$ for all $w \in \mathcal{K}$.

Definition 1 gives a precise definition of the notion of ensemble (or of state). Nevertheless, we shall give an explanation of this definition, since the intuitive usage of the notion of ensemble does not always coincide with the notion defined in Definition 1.

An ensemble $w$ is not a set or a family of microsystems, since $w$ is not a subset of $M$. $w$ is a subset of $\mathcal{P}(M)$, an equivalence class the elements of which are
It is an important feature of quantum mechanics that a class w has more than one element. We will demonstrate this in the case of the example given in Section 8. It should be stressed that we do not use the notion of ensemble to formulate the connection between experiment and mathematical theory, i.e. to intepret quantum mechanics. The interpretation of quantum mechanics given here depends only on the notions of preparing and recording procedures. Also, the notion of ensemble is not necessary for the statistical description which is already given by the function A y . The notion of ensemble is used only to analyse the structure which already has been founded in Section 1. The following definition proves useful in the subsequent analysis of the structure of the theory. Definition 3: The canonical mapping which maps an element a&SL' onto its corresponding equivalence class, w e 3if, will be denoted by <f> ; correspondingly, let iff be the canonical mapping of & onto SB. For f = (b , b), we also write iff(b , b) instead of tf/(f). In the following, we simplify the notation in writing /m instead of /I; the arguments in the function will show whether fi is defined on 2! x SF or 5if x SB. In this sense, the equality /*(«,/) = **(*(«).*(/)) holds. The relation (4) of Section 1(E) implies: If a = U< «. is a demixture of a preparing procedure a, we have for all g e SB n(<t>(a), g) = 1 Xifj,(<f>(ai), g) (6) with A, = A 2 (fl, a,), < A, < 1 and X,. A, = 1. We define, in analogy to equation (6): Definition 4: Let w e 3fC be an ensemble, let A, be a set of real numbers with :£ A, ^ 1 and £, A, = 1 and let w ( e 5ST be a set of ensembles such that for all geif /*(w,g) = SA,/*(M' I -,g) (7) holds; then (7) is called a demixture of w with respect to the components w, with weights A,. It is as essential as in the case of ensembles to avoid a false interpretation in the case of effects. 
As in the case of the notion of ensembles, it is to be stressed that effects are classes of effect procedures $(b_0, b)$. The mapping $\psi$, defined in Definition 3, maps several effect procedures onto the same effect. By applying the mapping $\psi$, parts of the structure $(\mathcal{R}_0, \mathcal{R})$ may get lost in the image $\mathcal{L}$ of $\psi$. This is actually the case, as we will see in discussing coexistent effects in Section 5(B). An effect procedure is characterized by a recording method $b_0$ and by a recording procedure $b$. Two recording methods $b_0^{(1)}$ and $b_0^{(2)}$ differing in their technical specification, together with different responses $b^{(1)}$ and $b^{(2)}$, can be representatives of the same effect $g \in \mathcal{L}$, i.e.
$$\psi(b_0^{(1)}, b^{(1)}) = \psi(b_0^{(2)}, b^{(2)}) = g.$$

The frequently used expression 'yes-no measurement' for the element $(b_0, b)$ of $\mathcal{F}$ is due to the fact that $b$ represents the response of the apparatus $b_0$. A more or less precise conception of yes-no measurement is used by many authors. Unfortunately, various authors attach different meanings to the words 'yes-no measurement'. Some people do not use this word for the experimental situation we are describing mathematically by the elements $(b_0, b)$ of $\mathcal{F}$. Our mathematical description has the advantage that all possible misunderstandings may be avoided.

Also, the word question instead of yes-no measurement is used by some authors, but not always in the same sense. Our definition is: $(b_0, b) \in \mathcal{F}$ is a question, $b$ the answer 'yes' and $b_0 \setminus b$ the answer 'no'.

In contradistinction to our definition, some authors use the words yes-no measurement or question to denote the elements of $\mathcal{L}$, which we called effects; some authors use these words only for the elements of a subset of $\mathcal{L}$, which we shall call 'decision effects' in the following (see Section 3).

In introducing the notions of ensemble (or state) and effect, we have taken a first step towards the 'usual' representation of quantum mechanics.
The next step is the introduction of the concept of 'simultaneous measurement' and the concept of 'observable'. Again, these concepts will be defined with the use of the fundamental concepts introduced in Section 1. The interpretation of the theory has already been given in Section 1, and new concepts are not introduced for the interpretation, but rather for the sake of a structural analysis of the theory. However, before introducing new concepts, we will analyse in the next section the connection between the mathematical symbols introduced so far and relation (1) of Section 1.

3. LAWS FOR THE PREPARATION AND RECORDING OF MICROSYSTEMS

For a 'physical' approach to the structure represented by relation (1), it seems best to introduce axioms (i.e. physical laws in mathematical form) for the preparing and recording procedures and to deduce from these axioms the following theorem: the function $\mu(w, g)$ on $\mathcal{K} \times \mathcal{L}$ can be represented in the form $\mu(w, g) = \operatorname{tr}((\alpha w)(\beta g))$ with injective mappings $\alpha$, $\beta$. We will formulate such axioms without a detailed physical discussion (cf., e.g., Ludwig, 1964, 1967a, b, c, d, 1968, 1970, 1971a, b); neither will we present the proofs of the theorems (see Ludwig, 1970, 1972b; Stolz, 1971 and Ludwig, in preparation b).

Since one always performs only a finite number of experiments (see the extensive discussions of the 'finiteness of physics' in Ludwig 1970, 1974b, and in preparation b), we assume that the sets $M$, $\mathcal{Q}$, $\mathcal{R}$ are denumerable (this is equivalent to the assumption that the completions of these sets are separable if endowed with a physically meaningful uniform structure; see Ludwig 1970, and in preparation b). If $M$, $\mathcal{Q}$, $\mathcal{R}$ are denumerable sets, so are $\mathcal{K}$ and $\mathcal{L}$.

The relations given in Section 2 imply the following theorem: There is a pair of real Banach spaces $\mathcal{B}$, $\mathcal{B}'$ ($\mathcal{B}'$ denotes the dual Banach space of $\mathcal{B}$) and an embedding of $\mathcal{K}$ into $\mathcal{B}$ and of $\mathcal{L}$ into $\mathcal{B}'$ such that

(1). The canonical bilinear form $\langle x, y \rangle$ defined on $\mathcal{B} \times \mathcal{B}'$ coincides with $\mu(w, g)$ on $\mathcal{K} \times \mathcal{L}$, i.e. $\mu(w, g) = \langle w, g \rangle \big|_{\mathcal{K} \times \mathcal{L}}$;

(2). $\mathcal{B}$ is a base norm space (see, for instance, Nagel, 1974), the base $K$ being equal to $\overline{\mathrm{co}}\,\mathcal{K}$ ($\overline{\mathrm{co}}\,\mathcal{K}$ denotes the norm-closed convex set generated by $\mathcal{K}$); the cone generated by $K$ is norm-closed;

(3). The linear hull of $\mathcal{L}$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $\mathcal{B}'$.

Points (1) through (3) determine $\mathcal{B}$ and $\mathcal{B}'$ uniquely up to isomorphism. $\mathcal{K}$ denumerable implies that $\mathcal{B}$ is separable.

In the following, we will denote the bilinear form $\langle x, y \rangle$ by $\mu(x, y)$ because of point (1) of the theorem. $\mathcal{B}$ being a base norm space implies: $\mathcal{B}'$ is an order unit space. Because of $0 \le \mu(w, g) \le 1$ for $w \in \mathcal{K}$ and $g \in \mathcal{L}$, it follows that $0 \le \mu(w, g) \le 1$ also for $w \in K$ and $g \in \mathcal{L}$, i.e. $\mathcal{L} \subset [0, 1]$, $[0, 1]$ being the order interval between the zero element and the order unit.

We denote by $L$ the set $\overline{\mathrm{co}}^{\,\sigma}\mathcal{L}$, the closure of $\mathrm{co}\,\mathcal{L}$ in the $\sigma(\mathcal{B}', \mathcal{B})$-topology. Let $\mathcal{D}$ be the norm closure of the linear hull of $\mathcal{L}$. We have $1 \in \mathcal{D}$. $\mathcal{D}$ is a separable Banach subspace of $\mathcal{B}'$ ($\mathcal{D}$ is also an order unit space). $\mathcal{D}$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $\mathcal{B}'$. $K$ is $\sigma(\mathcal{B}, \mathcal{D})$-precompact and $\sigma(\mathcal{B}, \mathcal{D})$-separable. $\mathcal{B}$ may be identified with a subspace of $\mathcal{D}'$ ($\mathcal{D}'$ being the dual Banach space of $\mathcal{D}$; $\mathcal{D}'$ is a base norm space), and consequently $K$ may be identified with a subset of $\mathcal{D}'$. Let $\bar{K}$ be the $\sigma(\mathcal{D}', \mathcal{D})$-closure of $K$ in $\mathcal{D}'$. $\bar{K}$ is $\sigma(\mathcal{D}', \mathcal{D})$-compact and $L$ is $\sigma(\mathcal{B}', \mathcal{B})$-compact. For the sets $\bar{K}$ and $L$ the theorem of Krein-Milman holds.

The topologies $\sigma(\mathcal{B}', \mathcal{B})$ and $\sigma(\mathcal{D}', \mathcal{D})$ are of considerable physical significance: first, the topologies $\sigma(\mathcal{D}', \mathcal{D})$ and $\sigma(\mathcal{D}', \mathcal{L})$ are identical on $K$ and $\bar{K}$, and the topologies $\sigma(\mathcal{B}', \mathcal{B})$, $\sigma(\mathcal{B}', K)$ and $\sigma(\mathcal{B}', \mathcal{K})$ are identical on $L$. The topologies $\sigma(\mathcal{D}', \mathcal{L})$ on $K$ (and $\mathcal{K}$) and $\sigma(\mathcal{B}', \mathcal{K})$ on $L$ (and $\mathcal{L}$) are suited to distinguish through experiment between different ensembles and different effects, respectively. This should be clarified, at least to some extent, in the case of ensembles: $\mu(w_1, g) = \mu(w_2, g)$ for all $g \in \mathcal{L}$ implies $w_1 = w_2$.
However, in experiments, an ensemble can only be tested by finitely many recording procedures, i.e. by finitely many $g \in \mathcal{L}$ (and not by all $g \in \mathcal{L}$). Likewise, the probability $\mu(w, g)$ can be tested only with a finite error. Thus we see that the inequalities
$$|\mu(w_1, g_i) - \mu(w_2, g_i)| < \varepsilon \qquad (i = 1, 2, \ldots, n)$$
may be tested experimentally only for a finite number of effects $g_1, \ldots, g_n$ and a finite error $\varepsilon$. These inequalities determine (for various $\varepsilon$, $n$, $g_i$) a neighborhood base of the topology $\sigma(\mathcal{D}', \mathcal{L})$.

To formulate the laws for preparing and recording we give some definitions:

Definition 5:
$$K_0(B) = \{w \mid w \in K,\ \mu(w, g) = 0 \text{ for all } g \in B \subset L\}$$
$$K_1(B) = \{w \mid w \in K,\ \mu(w, g) = 1 \text{ for all } g \in B \subset L\}$$
$$L_0(A) = \{g \mid g \in L,\ \mu(w, g) = 0 \text{ for all } w \in A \subset K\}$$

$K_0(B)$ and $K_1(B)$ are closed faces of $K$; $L_0(A)$ is a closed face of $L$. If $B$ has only one element $g$, we write simply $K_0(g)$ instead of $K_0(B)$, and similarly for $K_1$ and $L_0$.

It may easily be seen that the ordering $y_1 \le y_2$ in $\mathcal{B}'$ is equivalent to the relation
$$\mu(w, y_1) \le \mu(w, y_2) \quad \text{for all } w \in K.$$

Let us formulate the first law (concerning recording procedures) by the axiom:

AV 1.1  To every pair $g_1, g_2 \in L$ there exists a $g_3 \in L$ such that $g_3 \ge g_1$, $g_3 \ge g_2$, and $K_0(g_1) \cap K_0(g_2) \subset K_0(g_3)$.

AV 1.1 is equivalent to the following statement: Every set $L_0(A)$ has a greatest element (see Ludwig, 1970, and in preparation b); this greatest element may be denoted $eL_0(A)$.

All elements of $K$ (not only of $\mathcal{K}$) will be called ensembles (or states); all elements of $L$ (not only of $\mathcal{L}$) will be called effects. The elements of the form $eL_0(A)$ will be called decision effects. The set of all decision effects will be denoted by $G$. AV 1.1 implies $G \subset \partial_e L$ (for proof see Ludwig, 1970 and in preparation b), $\partial_e L$ being the set of extremal points of $L$.
Since, for a family $\{A_\alpha\}$ of subsets of $K$, the relation
$$L_0\Bigl(\bigcup_\alpha A_\alpha\Bigr) = \bigcap_\alpha L_0(A_\alpha)$$
holds, the set $\{L_0(A) \mid A \subset K\}$ is a complete lattice, the order relation being given by set-theoretical inclusion. The mapping $L_0(A) \to eL_0(A)$ is an order isomorphism of $\{L_0(A) \mid A \subset K\}$ onto $G$; hence $G$ is also a complete lattice with respect to the ordering induced on $G$ by the ordering of $\mathcal{B}'$.

Let $\hat{L}$ be the $\sigma(\mathcal{B}', \mathcal{B})$-closure of $\{y \mid y = \lambda g,\ \lambda \in \mathbf{R},\ g \in L \text{ and } 0 \le \lambda g \le 1\}$. As a second law (concerning again recording procedures), we postulate the axiom:

AV 1.2  $g \in \hat{L}$, $e \in G$ and $K_0(e) \subset K_0(g)$ implies $K_1(e) \supset K_1(g)$.

Let $C(w)$ be the norm-closed face of $K$ generated by $w$. Since $\mathcal{B}$ is separable, every norm-closed face of $K$ is of the form $C(w)$ with a suitably chosen $w$.

The next law (concerning both preparing and recording procedures) is given by the axiom:

AV 2  $w_1, w_2 \in K$ and $C(w_1) \not\subset C(w_2)$ imply that there is a $g \in L$ such that $w_2 \in K_0(g)$ but $w_1 \notin K_0(g)$.

AV 2 is equivalent to the relation: $K_0 L_0(F) = F$ for every norm-closed face $F$ of $K$. The axioms AV 1.1, AV 1.2 and AV 2 imply
$$L = \hat{L} = [0, 1].$$

In the theory, we need the following axiom AV id, which may be regarded as a mere mathematical idealization:

AV id  $K_1(e) \neq \emptyset$ for all $e \in G$ with $e \neq 0$.

AV id cannot be tested by experiments! AV id is equivalent to: $(1 - e) \in G$ for all $e \in G$. The mapping $e \to 1 - e$ is an orthocomplementation in the lattice $G$. The axioms AV 1.1 through AV id imply that $G$ is an orthocomplemented, orthomodular lattice.

We define the following 'distance' between two closed faces of $K$:
$$\Delta(C(w_2), C(w_3)) = \tfrac{1}{2} \inf\{\mu(w, e_3) \mid w \in C(w_2)\} + \tfrac{1}{2} \inf\{\mu(w, e_2) \mid w \in C(w_3)\},$$
$e_i$ being an abbreviation for $eL_0 C(w_i)$. Two faces $C(w_2)$ and $C(w_3)$ are called strictly separated if $\Delta(C(w_2), C(w_3)) \neq 0$.

The next law (concerning preparing procedures) is given by the axiom:

AV 3  If $w_1, w_2, w_3 \in K$ with $C(w_1) \subset C(w_3) \subset C(\tfrac{1}{2} w_1 + \tfrac{1}{2} w_2)$ and if the faces $C(w_2)$, $C(w_3)$ are strictly separated, we have $C(w_1) = C(w_3)$.

All of the axioms AV 1.1, AV 1.2, AV 2, AV id and AV 3 hold for all (!) known theories of physical systems, even for the so-called classical theories. The axiom distinguishing between 'classical' systems and microsystems is the following:

AV 4  For every face $C(w)$ of $K$ there is a sequence $w_\nu \in K$ such that $C(w_\nu)$ is of finite dimension, $C(w_{\nu+1}) \supset C(w_\nu)$ and
$$C(w) = \bigvee_\nu C(w_\nu).$$

It should be stressed that $\bigvee_\nu C(w_\nu)$ is not the set-theoretical union of the sets $C(w_\nu)$, but the smallest closed face of $K$ which includes all $C(w_\nu)$.

The concept of microsystem may now be defined as follows:

Definition 6: If the axioms AV 1, AV 2, AV 3 and AV 4 hold, the set $M$ endowed with the structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ is called a set of microsystems.

Since for the particular case of 'classical' systems the axiom AV 4 does not hold, we will call axiom AV 4 the 'law of microsystems'. 'Classical' physical systems can be defined as such systems for which all (!) $C(w)$ are infinite-dimensional and all decision effects are commensurable [for the concept of commensurability see Section 5(B); other equivalent forms of axioms for classical systems are given by H. Neumann (1972, 1974a, b); it has been demonstrated in these references how to regain the $\Gamma$-space from preparing and recording procedures].

It can be proved (see Ludwig 1970, 1972b; Stolz 1971, and Ludwig, in preparation b) that the axioms AV 1, AV 2, AV id, AV 3 and AV 4 are equivalent to the following statement:

$K$ and $L$ can be identified with the base of the Banach space $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ and with the order interval $[0, 1]$ of the Banach space $\mathcal{B}'(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ respectively, $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2, \ldots)$ being the space of all sequences $(W_1, W_2, \ldots)$, where every $W_i$ is a self-adjoint operator of the trace class in the Hilbert space $\mathcal{H}_i$, such that $\sum_i \operatorname{tr}\bigl((W_i^2)^{1/2}\bigr) < \infty$.

The dual Banach space $\mathcal{B}'(\mathcal{H}_1, \ldots)$ can be identified with the space of all sequences $(A_1, A_2, \ldots)$, any $A_i$ being a self-adjoint and bounded operator in $\mathcal{H}_i$, such that $\sup_i \|A_i\| < \infty$. The canonical bilinear form is given by:
$$\langle (W_1, W_2, \ldots), (A_1, A_2, \ldots) \rangle = \sum_i \operatorname{tr}(W_i A_i).$$

$K$ is the set of all sequences $(W_1, W_2, \ldots)$ such that $0 \le W_i$ and $\sum_i \operatorname{tr}(W_i) = 1$. $L$ is the set of all sequences $(F_1, F_2, \ldots)$ such that $0 \le F_i \le 1$.

The axioms imply that the $\mathcal{H}_i$ are Hilbert spaces over the fields $\mathbf{R}$ (of real numbers) or $\mathbf{C}$ (of complex numbers) or $\mathbf{Q}$ (of quaternions). There are physical arguments to eliminate the cases $\mathbf{R}$ and $\mathbf{Q}$.

4. ENSEMBLES AND EFFECTS IN QUANTUM THEORY

The identification (given in Section 3) of $K$ with the base of $\mathcal{B}(\mathcal{H}_1, \ldots)$ and of $L$ with the order interval $[0, 1]$ of $\mathcal{B}'(\mathcal{H}_1, \ldots)$ makes it possible to interpret the mappings $\phi$ and $\psi$ (defined in Section 2, Definition 3) as mappings of $\mathcal{Q}'$ into $\mathcal{B}(\mathcal{H}_1, \ldots)$ and of $\mathcal{F}$ into $\mathcal{B}'(\mathcal{H}_1, \ldots)$, respectively. $\phi\mathcal{Q}'$ is then norm-dense in the base $K$ of $\mathcal{B}(\mathcal{H}_1, \ldots)$ and $\psi\mathcal{F}$ is $\sigma(\mathcal{B}', \mathcal{B})$-dense in $L = [0, 1]$.

The norm-closed subspace of $\mathcal{B}'(\mathcal{H}_1, \ldots)$ generated by $\psi\mathcal{F}$ was denoted by $\mathcal{D}$ (see Section 3). A more detailed characterization of $\mathcal{D}$ by axioms cannot be given here (see Ludwig, in preparation a). Nevertheless, it should be mentioned that one may formulate axioms in such a way that $\mathcal{D}$ becomes a set of sequences $(A_1, A_2, \ldots)$, $A_i$ being a self-adjoint operator of a certain C*-algebra $\mathcal{A}_i$ of operators in $\mathcal{H}_i$. Thus $\mathcal{D}$ becomes the real part of a C*-algebra. We would like to call the attention of the reader to the fact that the 'set of states' in the theory of C*-algebras has been denoted (in Section 3) by $\bar{K}$ (and not by $K$, as is usually done in the theory of C*-algebras).

The set $G$ of decision effects (as introduced in Section 3) may then be identified with the set of all sequences
$$e = (E_1, E_2, \ldots),$$
$E_i$ being a projection operator in $\mathcal{H}_i$. The special decision effects
$$e_i = (0, 0, \ldots, 1_i, \ldots)$$
define the superselection rules. We shall refrain here from a detailed discussion.
However, it is not difficult to see that it is very practical to investigate theoretically as well as experimentally the various 'sorts' of microsystems (characterized by the $e_i$'s) separately. Each sort is described in one particular Hilbert space. In this way, one obtains the 'usual' quantum mechanical formalism. We cannot do this here in detail. Instead we want to show that the sets $\mathcal{Q}$ of preparing procedures and $\mathcal{R}$ of recording procedures may serve to elucidate some of the conceptions of 'usual' quantum mechanics.

As a first example we treat a structure very similar to the famous Heisenberg uncertainty relation. As mentioned above, it is sufficient to discuss the case of one Hilbert space only. The following theorem holds (see Ludwig, 1970):

There are two decision effects $E_1$ and $E_2$ such that, for every ensemble $W$, at least one of the following inequality relations must be false:
$$\operatorname{tr}\bigl(W (E_1 - \alpha_1 \mathbf{1})^2\bigr) \le [\ldots], \qquad \operatorname{tr}\bigl(W (E_2 - \alpha_2 \mathbf{1})^2\bigr) \le [\ldots],$$
where
$$\alpha_1 = \operatorname{tr}(W E_1), \qquad \alpha_2 = \operatorname{tr}(W E_2).$$

This theorem is analogous to Heisenberg's uncertainty relation, formulated for decision effects. It shows precisely that the Heisenberg uncertainty relation has nothing to do with experimental errors (at least not in principle; see also the discussion in Ludwig, 1975), since in measuring the decision effects $E_1$ and $E_2$ only the two values one or zero may be obtained.

We will simplify the following discussion by assuming $\phi\mathcal{Q}' = K$ and $\psi\mathcal{F} = L$ (which is not essential). Then we can express the physical content of the above theorem as follows:

There are two recording methods (i.e., it is possible to construct two recording apparatus) $b_0^{(1)}$ and $b_0^{(2)}$ with responses $b^{(1)}$ and $b^{(2)}$, respectively, such that $\psi(b_0^{(1)}, b^{(1)}) = E_1$ and $\psi(b_0^{(2)}, b^{(2)}) = E_2$ are decision effects and such that, for every preparing procedure $a$, at least one of the two probabilities
$$\lambda_{\mathcal{S}}(a \cap b_0^{(1)},\ a \cap b^{(1)}), \qquad \lambda_{\mathcal{S}}(a \cap b_0^{(2)},\ a \cap b^{(2)})$$
is essentially different from zero or one.
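A small numerical sketch may make the content of this statement more tangible. It rests on assumptions that are not taken from the text: a two-dimensional Hilbert space, ensembles written as $W = \tfrac{1}{2}(\mathbf{1} + \vec{r}\cdot\vec{\sigma})$ with $|\vec{r}| \le 1$, and the particular pair $E_1 = \tfrac{1}{2}(\mathbf{1} + \sigma_z)$, $E_2 = \tfrac{1}{2}(\mathbf{1} + \sigma_x)$. For a projection $E$ with $\alpha = \operatorname{tr}(WE)$ one has $\operatorname{tr}(W(E - \alpha\mathbf{1})^2) = \alpha(1 - \alpha)$, so the scan below only needs the two numbers $\alpha_1$, $\alpha_2$. The function names and the grid size are hypothetical.

```python
import math

def variance(alpha):
    """tr(W (E - alpha*1)^2) for a projection E with alpha = tr(W E).

    Because E*E = E, the expression reduces to alpha * (1 - alpha).
    """
    return alpha * (1.0 - alpha)

def worst_case(steps=60):
    """Scan a grid of Bloch vectors; return the smallest max(var1, var2) found."""
    best = 1.0
    for i in range(steps + 1):
        r = i / steps                      # r < 1: mixed ensembles, r = 1: pure ones
        for j in range(steps + 1):
            theta = math.pi * j / steps
            for k in range(steps):
                phi = 2.0 * math.pi * k / steps
                rx = r * math.sin(theta) * math.cos(phi)
                rz = r * math.cos(theta)
                alpha1 = (1.0 + rz) / 2.0  # tr(W E1)
                alpha2 = (1.0 + rx) / 2.0  # tr(W E2)
                best = min(best, max(variance(alpha1), variance(alpha2)))
    return best

bound = worst_case()
print(bound)
```

For this pair the two variances add up to $(2 - r_x^2 - r_z^2)/4 \ge 1/4$, so the larger of them can never drop below $1/8$, whatever ensemble is prepared. That value belongs to this illustrative pair only and is not meant as the bound of the theorem quoted above.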
However great the experimental effort put into constructing preparing procedures, at least one of the recording procedures $b^{(1)}$, $b^{(2)}$ will respond indeterministically, even in the case where $\psi(b_0^{(1)}, b^{(1)})$ and $\psi(b_0^{(2)}, b^{(2)})$ are decision effects.

Thus we may conclude that the Heisenberg uncertainty relation is a relation which concerns merely the possibility of constructing preparing apparatus. All the more, Heisenberg's uncertainty relation does not tell us anything about the possibility of 'simultaneous measurement'. We will have to come back to this problem in the following sections. The vagueness in some discussions of Heisenberg's uncertainty relation arises from the fact that in most cases there is no clear-cut distinction between preparing and recording procedures, since only the so-called 'ideal' measuring processes of the 'first' kind are discussed. These particular measuring processes are recording and preparing procedures at the same time; therefore, the Heisenberg uncertainty relation for the preparing procedure forbids the simultaneous recording of the values of position and momentum. However, the concept of simultaneous measuring can be defined in a very natural way without any recourse to the so-called ideal measuring processes of the first kind. We will do this in Section 5(A).

5. OBSERVABLES

The concept of an observable can be defined in a very natural way starting from the experimental situation of recording, described by the terms $\mathcal{R}_0$ and $\mathcal{R}$.

(A) Coexistent recording procedures

A pair $(b_0, b) \in \mathcal{F}$ was called an effect procedure. The mapping $\psi$ connects with each effect procedure $(b_0, b)$ an effect $\psi(b_0, b) \in L \subset \mathcal{B}'(\mathcal{H}_1, \ldots)$. All discussions in Sections 3 and 4 were concerned with the mapping of one $f = (b_0, b)$ onto one $g = \psi(f) = \psi(b_0, b)$ only. In reality, $b_0$ represents in general an apparatus which has several possibilities of response, namely, all $b$ with $b \subset b_0$.
Let us denote by $\mathcal{R}(b_0)$ the set of all $b$ with $b \subset b_0$. The elements of the set $\mathcal{R}(b_0)$ represent all recording procedures that are possible in applying the recording method $b_0$. In many experiments on microsystems, one uses methods $b_0$ where the set $\mathcal{R}(b_0)$ is so large that it is practically impossible to determine all $b \in \mathcal{R}(b_0)$. This situation may be illustrated by the following two typical examples of recording apparatus.

1. A system of many coupled counters. Each microsystem can produce responses of some of these counters. A recording procedure $b$ can, for instance, be characterized by the answer of three particular fixed counters. In this case the set $\mathcal{R}(b_0)$ is very well known to technicians: $\mathcal{R}(b_0)$ is the Boolean algebra of switching between the various counters.

2. The recording method $b_0$ is a bubble chamber (or a cloud chamber). In this case the set of all $b$'s is immense: every possible bubble corresponds to one $b$, but also every connected trace of bubbles corresponds to one $b$.

The two examples demonstrate another feature of quantum mechanics: the various responses $b \in \mathcal{R}(b_0)$ of the apparatus $b_0$ do not necessarily appear at the same time. In general a response $b$ need not be instantaneous but may have a finite duration, for instance in the case where $b$ represents the simultaneous response of two coupled counters responding with a time delay. Another example is a trace in a bubble chamber.

These examples show that the 'simultaneous recording' of the various $b$'s of $\mathcal{R}(b_0)$ has nothing in common with 'measuring at the same time'. The frequently used formulation that some observables, for instance position and momentum, are 'not measurable at the same time' but at 'different times' is at least incomprehensible, if not false (see Ludwig, 1975).

We define:

Definition 7: The recording procedures $b \in \mathcal{R}(b_0)$ are called coexistent with respect to the recording method $b_0$. Several $(b_0, b) \in \mathcal{F}$ which have the same $b_0$ are called coexistent effect procedures.
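The first example, the coupled counters, can be pictured with a few lines of code. The sketch assumes a hypothetical apparatus of three counters in which every microsystem produces exactly one response pattern such as `(1, 0, 1)`; recording procedures are then sets of patterns, and the set of all such subsets plays the role of $\mathcal{R}(b_0)$. The helper names `fired` and `complement` are invented for this illustration.

```python
from itertools import product

# Hypothetical apparatus b0: three coupled counters. An elementary response
# is the pattern of which counters fired, e.g. (1, 0, 1).
patterns = frozenset(product((0, 1), repeat=3))

def fired(*counters):
    """Recording procedure: every listed counter responded (the others are unconstrained)."""
    return frozenset(p for p in patterns if all(p[c] for c in counters))

def complement(b):
    """Recording procedure: the response b did not occur."""
    return patterns - b

b_12 = fired(0, 1)   # first and second counter both responded
b_3 = fired(2)       # third counter responded

# Meet, join and complement never leave the subsets of `patterns`, so all
# these recording procedures are coexistent with respect to the same b0.
both = b_12 & b_3
either = b_12 | b_3
neither = complement(either)

print(sorted(both))
print(len(either), len(neither))
```

A bubble chamber would correspond to an enormously larger set of patterns, but the Boolean operations would look the same.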
(B) Coexistent effects

If $(b_0, b)$ is a family of coexistent effect procedures, then the corresponding effects $\psi(b_0, b)$ all arise from elements $b$ of the same set $\mathcal{R}(b_0)$. Therefore, we may define a mapping $\psi_{b_0}$ of $\mathcal{R}(b_0)$ into $L$ by
$$\psi_{b_0}(b) = \psi(b_0, b).$$

It is not difficult to prove the following theorem:

The mapping $\psi_{b_0}$ is an additive and effective measure on the Boolean algebra $\mathcal{R}(b_0)$, which maps the unit element of $\mathcal{R}(b_0)$ onto $1 \in L$.

As an idealization we define:

Definition 8: A set $A \subset L$ is called a set of coexistent effects if there is a Boolean algebra $\Sigma$ endowed with an additive measure $F: \Sigma \to L$ such that $A \subset F\Sigma$.

The essential conception of coexistent effects has been defined (though not yet in a clear-cut way) in Ludwig (1964, 1967a, b, c, d). This conception is fundamental for the notion of observables.

Definition 9: A set $A \subset G$ (i.e. a set of decision effects) is called a set of commensurable decision effects if there is a Boolean algebra $\Sigma$ endowed with an additive measure $F: \Sigma \to G$ such that $A \subset F\Sigma$.

It may be proved that every set of coexistent decision effects is also a set of commensurable decision effects, and that the decision effects are commensurable if and only if the projection operators (in $\mathcal{H}_i$) belonging to these decision effects commute. This last condition is very well known. However, the foundation of this condition is usually presented with much 'philosophy'.

(C) Observables

The notion of observable is nothing but an idealization of the correspondence $\mathcal{R}(b_0) \to L$ given by $\psi_{b_0}$. This idealization is obtained in a process of completion (see Ludwig 1970 and in preparation a):

If a Boolean algebra $\Sigma$ is endowed with an additive and effective measure $F: \Sigma \to L$, then a metric may be defined on $\Sigma$ by
$$d(\sigma_1, \sigma_2) = \mu\bigl(w_0,\ F(\sigma_1 \wedge \sigma_2^*) + F(\sigma_2 \wedge \sigma_1^*)\bigr),$$
$\sigma^*$ being the complement of $\sigma$; $w_0$ is an effective ensemble, for instance $w_0 = \sum_\nu \lambda_\nu w_\nu$, $\lambda_\nu > 0$, $\sum_\nu \lambda_\nu = 1$, where the set $\{w_\nu\}$ is dense in $K$.
$\Sigma$ can be completed with respect to this metric (see Ludwig 1970 and in preparation a).

Definition 10: A Boolean algebra $\Sigma$ endowed with an additive measure $F: \Sigma \to L$ is called an observable if $\Sigma$ is complete and separable (with respect to the metric defined above).

This general concept of an observable has been introduced and analysed in (Ludwig, 1970); a more detailed analysis is contained in (Neumann, H., 1971) and in (Ludwig, in preparation a). It is impossible to give a structural analysis of the concept of an observable in this short article. We only wanted to stress that the concept of an observable is no more than an idealization of recording methods.

To show at least the connection between the notion of an observable defined in Definition 10 and the 'customary' notion, we add the following definitions:

Definition 11: An observable is called a decision observable if the mapping $F$ of Definition 10 is a mapping into $G$.

Definition 12: A mapping $\alpha \mapsto \sigma(\alpha)$ of the set $\mathbf{R}$ of real numbers into a Boolean algebra $\Sigma$ is called a measuring scale of $\Sigma$ if $\alpha_1 > \alpha_2$ implies $\sigma(\alpha_1) \ge \sigma(\alpha_2)$, if $\sigma(-\infty) = 0$, $\sigma(+\infty) = e$ ($e$ being the unit element of $\Sigma$), and if the set of all $\sigma(\alpha)$ generates the whole algebra $\Sigma$.

A decision observable endowed with a measuring scale is identical with what is 'usually' called an observable.

6. PREPARATORS

So far, in the discussions of quantum mechanics only the simultaneous measurability of decision observables has been of interest. However, the question of the possibilities of simultaneous preparation was neglected or, at the most, discussed as a partial aspect of simultaneous measurability, since one had in mind only 'ideal' measuring processes 'of the first kind'. These 'ideal' measuring processes are, indeed, connected with a certain idealized form of 'repreparing' processes (see Ludwig, 1972a, 1975).
At any rate, many fundamental questions of the interpretation of quantum mechanics will become much more transparent if the discussion of simultaneous preparation is separated from that of simultaneous recording.

Within the scheme of quantum mechanics as outlined here, a quite natural question arises: what is the condition for simultaneous preparation? Equation (6), combined with the identification of $K$ with a subset of $\mathcal{B}(\mathcal{H}_1, \ldots)$, implies the theorem:

Let $a = \bigcup_{i=1}^{n} a_i$ be a demixture of the preparing procedure $a$; then
$$\phi(a) = \sum_{i=1}^{n} \lambda_i\, \phi(a_i)$$
holds with $\lambda_i = \lambda_{\mathcal{Q}}(a, a_i)$, $0 < \lambda_i < 1$ and $\sum_i \lambda_i = 1$.

If there are two demixtures of the same preparing procedure,
$$a = \bigcup_{i=1}^{n} a_i = \bigcup_{k=1}^{m} \tilde{a}_k, \qquad (8)$$
then we have
$$\phi(a) = \sum_{i=1}^{n} \lambda_i\, \phi(a_i) = \sum_{k=1}^{m} \tilde{\lambda}_k\, \phi(\tilde{a}_k).$$
In this case these two demixtures also generate a third one, namely
$$a = \bigcup_{i,k}{}' \,(a_i \cap \tilde{a}_k), \qquad (9)$$
where the union $\bigcup'$ is taken over those pairs $i, k$ for which $a_i \cap \tilde{a}_k \neq \emptyset$. Expression (9) implies
$$\phi(a) = \sum_{i,k}{}' \,\lambda_{ik}\, \phi(a_i \cap \tilde{a}_k) \qquad (10)$$
where $\lambda_{ik} = \lambda_{\mathcal{Q}}(a, a_i \cap \tilde{a}_k)$.

It is very useful to introduce the following mapping $\phi_a: \mathcal{Q}(a) \to \check{K}$ ($\check{K}$ being the cap of the cone generated by $K$, i.e. $\check{K} = \bigcup_{0 \le \lambda \le 1} \lambda K$), defined by the equation
$$\phi_a(\tilde{a}) = \lambda_{\mathcal{Q}}(a, \tilde{a})\, \phi(\tilde{a}).$$
It is not difficult to see that $\phi_a: \mathcal{Q}(a) \to \check{K}$ is an additive measure on the Boolean algebra $\mathcal{Q}(a)$ with $\phi_a(a) = \phi(a) \in K$.

Definition 13: An element $\tilde{w} \in \check{K}$ with $\tilde{w} \le w \in K$ is called a mixture component, or shortly a component, of $w$.

If $\tilde{w}$ is a component of $w$, then so is $w - \tilde{w}$; and $w = \tilde{w} + (w - \tilde{w})$ is a demixture of $w$.

Two demixtures of one and the same preparing procedure,
$$a = \bigcup_{i=1}^{n} a_i = \bigcup_{k=1}^{m} \tilde{a}_k,$$
yield two demixtures of the ensemble $\phi(a)$:
$$\phi(a) = \sum_{i=1}^{n} \phi_a(a_i) = \sum_{k=1}^{m} \phi_a(\tilde{a}_k),$$
for which the components $\phi_a(a_i)$ and $\phi_a(\tilde{a}_k)$ are elements of the image of $\mathcal{Q}(a)$ under $\phi_a$.
That leads to the following definition:

Definition 14: Two demixtures of an ensemble $w \in K$,
$$w = \sum_{i=1}^{n} w_i = \sum_{k=1}^{m} \tilde{w}_k \qquad (w_i, \tilde{w}_k \in \check{K}),$$
are called coexistent if there is a Boolean algebra $\Sigma$ endowed with an additive measure $W: \Sigma \to \check{K}$ such that $W(e) = w$ ($e$ being the unit element of $\Sigma$) and $w_i, \tilde{w}_k \in W\Sigma$.

Two demixtures of one and the same preparing procedure $a$ give two coexistent demixtures of the ensemble $\phi(a)$.

Definition 15: A set $A \subset \check{K}$ is called a set of coexistent components of $w$ if there is a Boolean algebra $\Sigma$ endowed with a measure $W: \Sigma \to \check{K}$ such that $W(e) = w$ and $A \subset W\Sigma$.

The set
$$\phi_a \mathcal{Q}(a) = \{\phi_a(\tilde{a}) \mid \tilde{a} \in \mathcal{Q},\ \tilde{a} \subset a\}$$
is a set of coexistent components of $\phi(a)$.

If $W: \Sigma \to \check{K}$ is an additive and effective measure on the Boolean algebra $\Sigma$, then $W$ defines a metric in $\Sigma$. $\Sigma$ may be completed in this metric and the measure $W$ may be defined uniquely on this completion. This leads to the following idealization of $\phi_a: \mathcal{Q}(a) \to \check{K}$:

Definition 16: A Boolean algebra $\Sigma$ endowed with an additive measure $W: \Sigma \to \check{K}$ such that $W(e) \in K$ and $\Sigma$ is complete and separable is called a preparator.

Let $\Sigma$ be the completion of $\mathcal{Q}(a)$ in the metric defined by the measure $\phi_a$; then $\Sigma \to \check{K}$ is a preparator.

Many experiments (especially in elementary particle physics) may be described by a structure of the form $\phi_a: \mathcal{Q}(a) \to \check{K}$. This structure is also of paramount significance for a discussion of fundamental problems of quantum mechanics. We must refrain from a general mathematical structure-analysis of the concept of preparators [see for instance Ludwig (1975) and in preparation a]; rather, we will discuss some 'gedanken' experiments to elucidate the significance of preparators and of coexistent and non-coexistent demixtures.

First, we shall show that there exist non-coexistent demixtures. It is sufficient to prove this in the case of one Hilbert space $\mathcal{H}$ only. Let $\varphi, \psi \in \mathcal{H}$, $\|\varphi\| = \|\psi\| = 1$, $\varphi \perp \psi$ and $0 < \lambda < 1$; then $w = \lambda P_\varphi + (1 - \lambda) P_\psi$ is an ensemble.
$\lambda P_\phi + (1 - \lambda) P_\psi$ is a demixture of $w$, but it is easy to find others: Let $\chi$ be another vector in the plane generated by $\phi$ and $\psi$. Then it is possible to choose a real number $\mu$ ($0 < \mu < 1$) and a vector $\eta \in \mathcal{H}$ such that

$$w = \mu P_\chi + (1 - \mu) P_\eta$$

is another demixture of the same $w$.

Proof: $\chi$ has the form

$$\chi = \phi\, a + \psi\, b; \qquad |a|^2 + |b|^2 = 1$$

The conclusion is obtained by putting:

$$\mu = \frac{\lambda (1 - \lambda)}{\lambda |b|^2 + (1 - \lambda) |a|^2}, \qquad \eta = \frac{1}{\sqrt{\lambda^2 |b|^2 + (1 - \lambda)^2 |a|^2}} \left[ \phi\, \lambda \bar b - \psi\, (1 - \lambda) \bar a \right]$$

If $a \neq 0$ and $b \neq 0$, the two demixtures

$$w = \lambda P_\phi + (1 - \lambda) P_\psi = \mu P_\chi + (1 - \mu) P_\eta$$

cannot be coexistent, as we will prove immediately: Let $\Sigma$ be a Boolean algebra endowed with an additive measure $W$ such that there are $\sigma_1, \sigma_2 \in \Sigma$ satisfying $W(\sigma_1) = \lambda P_\phi$, $W(\sigma_2) = \mu P_\chi$. That implies

$$W(\sigma_1^*) = W(\varepsilon) - W(\sigma_1) = w - \lambda P_\phi = (1 - \lambda) P_\psi$$

and $W(\sigma_2^*) = (1 - \mu) P_\eta$. We want to calculate $W(\sigma_1 \wedge \sigma_2)$. One has $W(\sigma_1 \wedge \sigma_2) \le W(\sigma_1) = \lambda P_\phi$ and $W(\sigma_1 \wedge \sigma_2) \le W(\sigma_2) = \mu P_\chi$. Since $v \in \check K$ and $0 \le v \le P_\phi$, $0 \le v \le P_\chi$ implies $v = 0$, we get $W(\sigma_1 \wedge \sigma_2) = 0$. In the same way, it follows that $W(\sigma_1 \wedge \sigma_2^*) = 0$, $W(\sigma_1^* \wedge \sigma_2) = 0$, and $W(\sigma_1^* \wedge \sigma_2^*) = 0$. Now

$$\varepsilon = (\sigma_1 \wedge \sigma_2) \vee (\sigma_1 \wedge \sigma_2^*) \vee (\sigma_1^* \wedge \sigma_2) \vee (\sigma_1^* \wedge \sigma_2^*)$$

implies

$$w = W(\varepsilon) = W(\sigma_1 \wedge \sigma_2) + W(\sigma_1 \wedge \sigma_2^*) + W(\sigma_1^* \wedge \sigma_2) + W(\sigma_1^* \wedge \sigma_2^*) = 0$$

in contradiction to $w = \lambda P_\phi + (1 - \lambda) P_\psi \neq 0$.

7. PROPOSITIONS CONCERNING EXPERIMENTS ON INDIVIDUAL MICROSYSTEMS

So far, one is accustomed to speak about individual microsystems and about the results of experiments with individual microsystems using common (not mathematical) language. By contrast, mathematical language is employed in the classical mechanics of mass points, where every individual mass point is represented by a mathematical trajectory which is then compared with the 'real' trajectory obtained in measurement. Every proposition concerning an individual system may be formulated in mathematical form. The situation is quite different in customary quantum mechanics, where an individual experimental result, e.g.
an individual trace in a cloud chamber, cannot be compared with the theory, since the usual quantum mechanical mathematical picture comprises only terms describing the statistics. For instance, it is not possible to translate into mathematical language such propositions as 'the position of this individual electron has been measured in the region $V$'. It is not possible to find a corresponding mathematical relation for this proposition. Only after a series of such measurements of the position has been carried out can the statistics of the results be compared with the theory, using the relation $\mu = \operatorname{tr}(W E_V)$. Here, $E_V$ is the following decision effect: the measured position is localized in $V$.

Many ways of coping with this defect of quantum mechanics have been tried. Some approaches should be mentioned.

Some physicists call the projection operators (or, in our terminology, the decision effects) properties of the individual microsystems, and formulate propositions such as: The microsystem $x$ has the property $e$; I know that $x$ has the property $e$; $x$ has the property 'not $e$'; $x$ has not the property $e$; I do not know whether $x$ has the property $e$; $x$ has the property $e$ with the probability $\alpha$; the property $e$ has been measured on $x$; etc. No wonder that one gets into difficulties in using this type of language. Some people have tried to avoid these difficulties by introducing a more precise form of this language and a new logic, too.

Another attempt is to interpret the projection operators as propositions. In this case, the authors have to say which propositions (formulated in common language) should be represented symbolically by projection operators. Correspondingly, some authors interpret the lattice of projection operators as a proposition-logic, the so-called quantum logic.
This conception of quantum logic is to be distinguished from others, where the lattice of decision effects is formally called 'quantum logic' without any claim that the term 'quantum logic' should have anything in common with a logic of propositions.

We want to show that every experimental result (even one concerning an individual microsystem) can be described by a mathematical relation if one accepts the foundations of quantum mechanics outlined in the first sections of this paper. This description is possible without employing any new form of logic, using only the usual logic of mathematics.

It is not possible to give here a complete survey of all possible ways of formulating propositions on individual microsystems in mathematical language; only some examples can briefly be sketched.

To avoid mathematical difficulties connected with the fact that $\varphi\mathcal{Q}'$ is only dense in $K$ and $\psi\mathcal{F}$ only dense in $L$, we will sharpen our axioms by postulating: $\varphi\mathcal{Q}' = K$ and $\psi\mathcal{F} = L$. This sharpening is without any fundamental physical relevance, since every comparison between theory and experiment can only be made with a finite inaccuracy (in any case, this sharpening may be avoided by using more sophisticated mathematics).

For the following discussions it is essential to realize that the physical interpretation of '$x \in a$, $a \in \mathcal{Q}'$' and of '$x \in b$, $b \in \mathcal{R}$' has already been given in Section 1. All other propositions may be deduced from these two fundamental relations. For instance, a proposition of the form 'the microsystem $x$ has the property $e$' makes no sense until one has given this proposition a mathematical form which is reducible to relations of the form $x \in a$, $a \in \mathcal{Q}'$ and $x \in b$, $b \in \mathcal{R}$. No intuitive meaning of properties is introduced. Every interpretation must ultimately be reducible to the interpretation of preparing and recording procedures.
We define the following sets:

Definition 17: For every $e \in G$ let

$$\mathcal{Q}(e) = \{a \mid a \in \mathcal{Q}' \text{ and } \varphi(a) \in K_1(e)\}, \qquad M_p(e) = \bigcup_{a \in \mathcal{Q}(e)} a$$

We want to discuss the following relation:

$$x \in M_p(e) \qquad (11)$$

The physical interpretation of (11) is clear by Definition 17, since $e$ is defined by preparing and recording procedures, as has been done in Section 2, and $M_p(e)$ is the union of some preparing procedures.

However, one usually wants to express relation (11) by a short formulation in common language. To this end, we introduce the following terminology: The elements $e \in G$ will be called pseudo-properties (not simply properties, to avoid misunderstandings). We express the relation (11) in the form: 'the microsystem $x$ has been prepared with the pseudo-property $e$'.

Relation (11), formulated for microsystems, may also be formulated for macrosystems and, in this context, it represents a very interesting technical procedure. For instance, for steel balls manufactured by a machine, it reads: 'this individual steel ball $x$ has the property $e$, i.e., its radius lies in the interval $[r_1, r_2]$'.

Relation (11) is a proposition on a pseudo-property of the prepared microsystem $x$. No misunderstanding is possible if we use relation (11) for such propositions. The logical negation of (11) is $x \notin M_p(e)$. There is another relation: $x \in M_p(e^\perp)$, $e^\perp = 1 - e$. However, the two relations $x \notin M_p(e)$ and $x \in M_p(e^\perp)$ are not equivalent! This fact has nothing to do with logic, but only reflects the structure of the family of sets $\{M_p(e), e \in G\}$.

Another possible proposition on an individual microsystem may be obtained using the following definition:

Definition 18: For every $e \in G$ we define:

$$\mathcal{R}(e) = \{b \mid b \in \mathcal{R} \text{ and there is a } b_0 \in \mathcal{R}_0 \text{ such that } b_0 \supset b \text{ and } \psi(b_0, b) \le e\}; \qquad M_r(e) = \bigcup_{b \in \mathcal{R}(e)} b$$

We want to discuss the relation

$$x \in M_r(e) \qquad (12)$$

We express (12) in common language by the proposition: 'The pseudo-property $e$ of the microsystem $x$ has been recorded'.
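The non-equivalence of the negation of (11) and the relation for the complementary effect can be made concrete with a toy model. The sketch below is an illustration only: the procedure names, the system labels and the probabilities are invented, and each preparing procedure is reduced to a set of system labels together with the probability that its ensemble assigns to the decision effect. A procedure counts towards the set of systems prepared with the pseudo-property exactly when that probability equals one.

```python
# Hypothetical toy data: name -> (systems prepared, probability of the effect e).
# "a_all" is the coarse procedure containing everything; its probability 0.5 is the
# equal-weight mixture of the three finer procedures (1 + 0 + 0.5) / 3.
procedures = {
    "a_up": ({1, 2}, 1.0),
    "a_down": ({3, 4}, 0.0),
    "a_plus": ({5, 6}, 0.5),
    "a_all": ({1, 2, 3, 4, 5, 6}, 0.5),
}


def prepared_with(procedures, complement=False):
    """Union of all procedures whose ensemble gives probability one to e
    (or to 1 - e when complement is True)."""
    members = set()
    for systems, prob_e in procedures.values():
        prob = 1.0 - prob_e if complement else prob_e
        if prob == 1.0:
            members |= systems
    return members


Mp_e = prepared_with(procedures)
Mp_not_e = prepared_with(procedures, complement=True)

x = 5
print(sorted(Mp_e), sorted(Mp_not_e))
print(x in Mp_e, x in Mp_not_e)
```

System 5 belongs to neither set, so failing to be prepared with the pseudo-property does not mean being prepared with its complement. Systems 1 and 2 belong to the first set although they are also members of the coarse procedure, which reflects that the set is a union over procedures rather than a statement about every procedure containing the system.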
We define:

Definition 19: $M(e) = M_p(e) \cup M_r(e)$

The mathematical relation

$$x \in M(e) \qquad (13)$$

is equivalent to

$$x \in M_r(e) \ \text{ or } \ x \in M_p(e)$$

We express (13) in normal language by: 'The microsystem $x$ has the pseudo-property $e$'.

It seems in order to stress again that the two relations $x \notin M(e)$ and $x \in M(e^\perp)$ are not equivalent.

At the beginning of Section 7, we mentioned some of the propositions usually used in the interpretation of quantum mechanics. These propositions have the disadvantage that it is often unclear what they mean, and various authors may give them various meanings. This is not possible when using relations (11), (12) and (13).

What about a new logic to treat the relations (11), (12) and (13)? First of all, it is to be stressed that in 'handling' relations (11), (12) and (13) mathematically, one has to use customary mathematical logic! What does this imply? We will demonstrate this only by examples.

In the following, let $x$ be an individual microsystem we are experimenting with. Let $e$ be a well-defined element of $G$, for instance the following decision effect: 'the position is in a certain space-region $V$'. We will now discuss some possibilities of describing what we have done in experimenting with the microsystem $x$.

(1). We have shown in Ludwig (1970, 1975) that experimental results and procedures may be written down in mathematical form. These relations representing the experiments are, from the point of view of mathematical theory, new axioms, denoted in Ludwig (1970, 1975) by $(-)_r$. We assume that these axioms $(-)_r$ are added to the mathematical theory and that, for instance, (13) is a theorem in this stronger mathematical theory. If this is the case, one says for instance: I know that $x$ has the (pseudo-)property $e$. The words 'I know' could be misunderstood in the following way: someone could think that my subjective knowledge is essential here.
However, what we in fact mean is that the real experimental situation (expressed by the axioms $(-)_r$ together with the theory) makes it possible to deduce (13) as a theorem. In the case where (13) is deducible as a theorem, another sentence is also used: The proposition (13) is 'true'. If one uses the words 'I know' or 'true' only as abbreviations for the fact that (13) is deducible as a theorem, there is no objection and no real mistakes are possible.

(2). Suppose, for instance, that relation (12) is not a theorem but may be added without contradiction as an axiom to the theory complemented by the axioms $(-)_r$. In this case, one may say: It is possible (but not necessary) to record the pseudo-property $e$ on the microsystem $x$. Or: The pseudo-property $e$ can be recorded on $x$.

It is exactly this situation which, in the opinion of some authors, may be described only by a new logic. In fact, any mathematician knows that a relation like (12) may be added to a theory without any contradiction although (12) is not a theorem of the theory. In this case, it is also possible to add (without any contradiction) the negation of the relation (12): $x \notin M_r(e)$. If one wants to interpret the word 'true' as 'deducible as a theorem' (see above), then, in this sense, mathematical logic has been 'many-valued' since the very beginning of mathematics. In a mathematical theory a relation $A$ can be a theorem, or the relation (not $A$) can be a theorem, or (as a third possibility) neither $A$ nor (not $A$) is a theorem, but $A$ as well as (not $A$) can be added as an axiom without any contradiction (naturally, not both of these relations!). However, no new logical axioms different from the usual mathematical axioms are necessary to discuss quantum mechanics.

(3). The negation of relation (12) is a theorem in the theory complemented by the experimental results $(-)_r$. One says: It is impossible that $x$ could be recorded with the pseudo-property $e$.
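The three situations (1), (2) and (3) can be imitated in a deliberately small propositional toy model. The sketch below is an assumption-laden illustration, not the theory itself: it uses three invented propositional variables, lets the variable `x_in_b` stand in for relation (12), and identifies 'theorem' with 'true in every two-valued model of the axioms', which is legitimate for propositional logic. Ordinary two-valued logic then already yields the three statuses.

```python
from itertools import product

VARIABLES = ["x_in_a", "x_in_b0", "x_in_b"]


def classify(axioms, statement):
    """Classify `statement` relative to `axioms` by enumerating all two-valued models."""
    models = []
    for values in product([False, True], repeat=len(VARIABLES)):
        env = dict(zip(VARIABLES, values))
        if all(axiom(env) for axiom in axioms):
            models.append(env)
    if not models:
        return "inconsistent axioms"
    truth_values = {statement(env) for env in models}
    if truth_values == {True}:
        return "theorem"
    if truth_values == {False}:
        return "negation is a theorem"
    return "independent"


def recorded(env):
    """Stand-in for relation (12): x responded on the procedure b."""
    return env["x_in_b"]


# Theory: x has been prepared by a, and b is a part of the recording method b0.
theory = [
    lambda env: env["x_in_a"],
    lambda env: (not env["x_in_b"]) or env["x_in_b0"],
]

before_recording = classify(theory, recorded)
after_response = classify(theory + [lambda env: env["x_in_b"]], recorded)
after_no_response = classify(
    theory + [lambda env: env["x_in_b0"] and not env["x_in_b"]], recorded
)
print(before_recording, "|", after_response, "|", after_no_response)
```

Before any recording axiom is added the statement is independent of the theory, which corresponds to 'possible but not necessary'; adding the experimental result as a further axiom turns it into a theorem or into a refuted statement.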
It may be stressed again that $x \notin M_r(e)$ does not imply $x \in M_r(e^\perp)$!

After having interpreted words such as 'I know', 'true', 'possible' and 'impossible', we want to discuss example (2) in more detail. We assume, in particular, that the microsystem $x$ has been prepared experimentally by a procedure $a$ (but not yet recorded) and that the well-known construction of the preparing apparatus makes it possible to determine the element $\varphi(a)$ of $K$. Under this assumption, relation (12) may be added as an axiom without any contradiction if and only if $\mu(\varphi(a), e) \neq 0$. If this is the case, one sometimes states that relation (12) is true with probability $\mu(\varphi(a), e)$. However, this statement could be misunderstood.

In fact, the question as to the 'possibilities' of (12) is very complicated. Therefore, we ask: How is it possible to 'realize' the relation (12)? We ask for an experimental possibility of a recording procedure such that, after writing down the experimental results of recording (in the form of axioms $(-)_r$), relation (12) will appear as a theorem. However, this experimental possibility does not depend only on the probability; it also depends on a certain 'arbitrary choice'. What is arbitrary here is the selection of the recording method $b_0$. $x \in b_0$ can be realized for many different recording methods $b_0$. It is up to the experimentalist which method $b_0$ he will apply. If a certain apparatus $b_0$ has been installed to be used in the experiment, the relation $x \in b_0$ (as one of the axioms of $(-)_r$) is to be added to the theory. Now we have a situation very typical for microsystems: By the selection of a certain $b_0$, i.e. by writing down $x \in b_0$ for a certain $b_0$, the 'possibilities' have changed essentially. By $b_0$, i.e. by $\mathcal{R}(b_0) \xrightarrow{\psi} L$, an observable is defined. It may be that the decision effect $e$ in (12) is not coexistent with all effects $\psi(b_0, b)$.
In particular, $b_0$ can be chosen in such a manner that the following theorem holds: There is no $b \in \mathcal{R}$ such that $b \neq \emptyset$, $b \subset b_0$, and $\psi(b_0, b) \le e$.

If $x \in a$ and $x \in b_0$ are fixed by the experiment and if the theorem above holds for $b_0$, then relation (12) is false, i.e. $x \notin M_r(e)$ is a theorem in the theory with axioms $x \in a$, $x \in b_0$. We say: it is impossible that $x$ could be recorded with the pseudo-property $e$.

However, if one has not yet fixed the recording method $b_0$, one is free to choose another one. For instance, it is possible to choose a recording method $b_0$ such that there exists a $b \in \mathcal{R}$ such that $b \subset b_0$ and $\psi(b_0, b) \le e$. Now suppose we have chosen such a $b_0$ experimentally. What can we say then about the 'possibility' of relation (12)?

It is clear that (12) may be added as an axiom without contradicting the theory containing the axioms $x \in a$ and $x \in b_0$ if there is a $b \in \mathcal{R}(b_0)$ such that $\psi(b_0, b) \le e$ and $\mu(\varphi(a), \psi(b_0, b)) \neq 0$.

If there are $b_1, b_2 \in \mathcal{R}(b_0)$ such that $\psi(b_0, b_1) \le e$, $\psi(b_0, b_2) \le e$, then the relation $\psi(b_0, b_1 \cup b_2) \le e$ also holds, which can be proved simply: $b_1 \cup b_2 = [b_1 \cap (b_0 \setminus b_2)] \cup b_2$ implies

$$\psi(b_0, b_1 \cup b_2) = \psi(b_0, b_1 \cap (b_0 \setminus b_2)) + \psi(b_0, b_2)$$

One has:

$$g_1 = \psi(b_0, b_1 \cap (b_0 \setminus b_2)) \le \psi(b_0, b_1) \le e$$

and $g_2 = \psi(b_0, b_2) \le e$. If we write $g = \psi(b_0, b_1 \cup b_2)$, we get $g = g_1 + g_2$. This equation and the relations $g_1 \le e$, $g_2 \le e$ imply $K_0(g) \supset K_0(e)$, which is equivalent to $g \le e$.

To avoid measure-theoretical arguments, we will assume that the union of all elements $b'$ of $\mathcal{R}(b_0)$ such that $\psi(b_0, b') \le e$ is an element $b$ of $\mathcal{R}(b_0)$. $b$ is the greatest element of $\mathcal{R}(b_0)$ satisfying $\psi(b_0, b) \le e$. Therefore, if $b_0$ has been chosen as the recording method, we say: relation (12) is possible with probability $\mu(\varphi(a), \psi(b_0, b))$. If $b_0$ has been chosen in such a way that $\psi(b_0, b) = e$, the probability is equal to the greatest possible value $\mu(\varphi(a), e)$.

We now assume that the recording of the system $x$ has been accomplished and that our $b$ has given a response.
Then the relation $x \in b$ must be added to the theory as an axiom, i.e. as a mathematical formulation of the experimental result. In this more comprehensive theory, relation (12) is a theorem; we say: 'the microsystem $x$ has been recorded with the pseudo-property $e$'. Since (13) is a theorem, too, we also say after this experiment has been performed: 'The microsystem $x$ has the pseudo-property $e$'.

However, if $b$ has not responded, i.e. if $b_0 \setminus b$ has given a response, then $x \in b_0 \setminus b$ is to be added as an axiom. This implies that the relation $x \notin M_r(e)$ is a theorem; we say: 'it is impossible that the system $x$ has been recorded with the pseudo-property $e$'. But $x \in M(e)$ can be a theorem if $\varphi(a) \in K_1(e)$!

Some authors intuitively use the proposition '$x$ has the property $e$' as equivalent to relation (13). To others, this proposition seems inadmissible. We will see why.

Let $e_1, e_2$ be two incommensurable decision effects such that $M(e_1) \cap M(e_2) \neq \emptyset$ (for instance, let $e_1$ be the decision effect: the position is in a very small space-region $V$, and $e_2$: the momentum is in a very small region $\Pi$ of the momentum space). $M(e_1) \cap M(e_2) \neq \emptyset$ implies that the relation

$$x \in M(e_1) \ \text{ and } \ x \in M(e_2) \qquad (14)$$

could be an allowed hypothesis (see Ludwig 1974b, and in preparation b), i.e. that in the case of a suitable experimental situation relation (14) may be added to the theory without any contradiction. It may also be that the experimental situation is such as to admit relation (14) as a theorem.

At first sight, relation (14) seems to contradict the well-known quantum-mechanical laws, for instance the Heisenberg uncertainty relation, if the regions denoted above by $V$ and $\Pi$ are small enough. Relation (14) expressed in common language would read: The microsystem $x$ has the pseudo-property $e_1$ (for instance, a position in $V$) and the pseudo-property $e_2$ (for instance, a momentum in $\Pi$).
Such a sentence in common language could, indeed, seem to be in contradiction to quantum mechanical laws. However, in reality, there is no contradiction. We may clear up this problem by formulating it in mathematical symbols. Relation (14) is, for instance, a theorem if the experiment is of such a type that $x \in a$, $\varphi(a) \in K_1(e_1)$, $x \in b$, $b$ being such that there is a $b_0 \supset b$ and $\psi(b_0, b) \le e_2$.

Another objection to accepting the proposition '$x$ has the pseudo-property $e$' as a translation of (13) into common language is the following: Let $a \in \mathcal{Q}'$ be such that $\varphi(a) \in K_1(e)$ and $x \in a$. Then (13) is a theorem, even if the recording procedure $b_0$ is such that $\psi(b_0, b)$ is not coexistent with $e$. After (!) the recording method $b_0$ had been employed, the microsystem $x$ was influenced by the apparatus $b_0$ in such a way that the microsystem does not have the pseudo-property $e$ any more.

This objection is not correct. It is true that there was an influence of the recording apparatus on the microsystem. However, it is not essential for the interpretation of quantum mechanics to know what happens after the recording procedure. Only the correlations between preparing and recording are essential. Why do many physicists believe that a measurement (we called it a recording procedure) yields information on the system after (!) measurement? Only because they tacitly assume the so-called 'ideal measurement of the first kind'! However, in general, the recording is much more complicated when the influence of the recording apparatus on the microsystem is taken into account; see Ludwig (1972a, 1975).

In our interpretation of quantum mechanics, the recording response yields information on a microsystem such as it was before the recording process materialized. The essential relations $x \in a$ ($a \in \mathcal{Q}'$) and $x \in b$ ($b \in \mathcal{R}$) are relations which give information on the system $x$ after preparing and before recording.
The probability $\mu(\varphi(a), \psi(b_0, b))$ gives the correlation between preparing and recording processes, a correlation due to the so-called microsystem conceived as a system between preparing and recording.

Instead of our description of the basic quantum-mechanical processes, given by the structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$, a lot of different postulates for the measuring process are considered to be fundamental by other authors. The majority of these postulates have their origin in the famous postulate (M) in J. v. Neumann's book Mathematical Foundations of Quantum Mechanics (1955). This postulate (M) reads:

(M.) If the physical quantity, $R$, is measured twice in succession in a system, $S$, then we get the same value each time. This is the case even though $R$ has a dispersion in the original state of $S$, and the $R$-measurement can change the state of $S$.

V. Neumann's postulate (M) was the source of the widespread opinion that it is essential, in a measurement, to fix the state after the measurement. The existence of such measurements of the first kind, as they are postulated by (M), is not necessary for the interpretation of quantum mechanics; the structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ is sufficient. Moreover, this structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ seems to represent a more natural basis for quantum mechanics considered as a theory describing microsystems between preparing and recording.

The description of a microsystem as a real entity between preparing and recording should not be taken as a hint to endow such microsystems with additional intuitively motivated structures. Such additional structures (read: not deducible from $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$) could eventually prove to contradict quantum mechanics [see, for instance, the theories of hidden variables; see also Ludwig (1975)].

The examples (11) through (14) should suffice to demonstrate that, on the conceptual basis given in Section 2, it is possible to speak of individual microsystems without any contradiction.

8.
THE EINSTEIN-PODOLSKY-ROSEN PARADOX

We will discuss only the spin example of the Einstein-Podolsky-Rosen paradox (in the following abbreviated as the EPR paradox).

Pairs of particles may be prepared in such a way that the total spin is equal to zero; the spin of each single particle of the pair may be equal to 1/2.

To describe the EPR paradox, one needs three Hilbert spaces $\mathcal{H}_i$: Let $\mathcal{H}_1$ be the Hilbert space of the particles of sort 1, $\mathcal{H}_2$ the Hilbert space of the particles of sort 2 and $\mathcal{H}_3$ the Hilbert space of the pairs (1, 2). To simplify the problem we shall consider only the spin spaces, i.e. $\mathcal{H}_1$ and $\mathcal{H}_2$ are both two-dimensional and $\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$.

We install a preparing apparatus $a$ which prepares only pairs of total spin zero. The ensemble $\varphi(a)$ is described by a statistical operator in $\mathcal{H}_3 = \mathcal{H}_1 \times \mathcal{H}_2$. We have $\varphi(a) = P_\chi$, $\chi$ being the vector

$$\chi = \frac{1}{\sqrt{2}} \left( u_+(1)\, u_-(2) - u_-(1)\, u_+(2) \right)$$

We have denoted the eigenvectors of the 3-component of the spin of particle 1 by $u_+(1)$, $u_-(1)$, and the corresponding eigenvectors of particle 2 by $u_+(2)$, $u_-(2)$.

Let $b_0^{(2)}$ be a recording method that measures the 3-component of the spin of particles 2. Then there is a recording procedure $b^{(2)} \subset b_0^{(2)}$ such that $\psi(b_0^{(2)}, b^{(2)}) = 1 \times P_{u_+}$ and $\psi(b_0^{(2)}, b_0^{(2)} \setminus b^{(2)}) = 1 \times P_{u_-}$ (as operators in $\mathcal{H}_3$).

Now we want to construct a new preparing apparatus composed of the apparatus $a$ and the apparatus $b_0^{(2)}$. We get three new preparing procedures, preparing particles of sort 1. We do this in the following way (see Figure 1): The pairs prepared by $a$ leave the preparing apparatus in such a manner that the particles of sort 1 and 2 leave the apparatus in opposite directions. Particles of sort 2 enter the recording apparatus $b_0^{(2)}$. The particles of sort 1 leave the new apparatus composed of $a$ and $b_0^{(2)}$.

Figure 1 Preparing apparatus composed of apparatus $a$ and $b_0^{(2)}$
The apparatus composed of $a$ and $b_0^{(2)}$ gives us three preparing procedures:

The preparing procedure $a_1^{3}$ comprises all particles of sort 1 leaving $a$.

The preparing procedure $a_1^{3+}$ comprises all particles of sort 1 such that $b^{(2)}$ has given a response.

The preparing procedure $a_1^{3-}$ comprises all particles of sort 1 such that $b_0^{(2)} \setminus b^{(2)}$ has given a response.

Apparently, the following relations hold: $a_1^{3} = a_1^{3+} \cup a_1^{3-}$ and $a_1^{3+} \cap a_1^{3-} = \emptyset$, i.e. $a_1^{3} = a_1^{3+} \cup a_1^{3-}$ is a demixture of $a_1^{3}$. $\varphi(a_1^{3})$, $\varphi(a_1^{3+})$, $\varphi(a_1^{3-})$ are operators in $\mathcal{H}_1$:

$$\varphi(a_1^{3}) = W = \tfrac{1}{2} \mathbf{1}, \qquad \varphi(a_1^{3+}) = P_{u_-}, \qquad \varphi(a_1^{3-}) = P_{u_+}$$

Since

$$\lambda_{\mathcal{Q}}(a_1^{3}, a_1^{3+}) = \lambda_{\mathcal{Q}}(a_1^{3}, a_1^{3-}) = \tfrac{1}{2}$$

the demixture $a_1^{3} = a_1^{3+} \cup a_1^{3-}$ leads (according to Section 6(A)) to the demixture of the ensemble $W$:

$$W = \varphi(a_1^{3}) = \tfrac{1}{2} \varphi(a_1^{3+}) + \tfrac{1}{2} \varphi(a_1^{3-}), \quad \text{i.e.} \quad W = \tfrac{1}{2} P_{u_-} + \tfrac{1}{2} P_{u_+}$$

It is not difficult to construct many more preparing procedures in a similar way. We combine the preparing apparatus $a$ with another recording apparatus, one that measures the 1-component of the spin of particles of sort 2. Then we get three preparing procedures $a_1^{1}$, $a_1^{1+}$, $a_1^{1-}$ such that $\varphi(a_1^{1+}) = P_{v_-}$, $\varphi(a_1^{1-}) = P_{v_+}$, $v_+$ and $v_-$ being the eigenvectors of the 1-component of the spin. $a_1^{1} = a_1^{1+} \cup a_1^{1-}$ is a demixture of $a_1^{1}$, which implies

$$\varphi(a_1^{1}) = W = \tfrac{1}{2} \mathbf{1} = \tfrac{1}{2} P_{v_-} + \tfrac{1}{2} P_{v_+}$$

Since $\varphi(a_1^{3}) = \varphi(a_1^{1})$, we have obtained two different demixtures of the same $W$.

Since the two preparing procedures $a_1^{3}$ and $a_1^{1}$ differ only in the measuring parts, one could think that we should postulate: $a_1^{3} = a_1^{1}$. However, although the postulate $a_1^{3} = a_1^{1}$ seems very 'plausible' at first sight, it leads to a contradiction. Indeed, we will prove: $a_1^{3} \cap a_1^{1} = \emptyset$, which contradicts the postulate $a_1^{3} = a_1^{1}$, since $a_1^{3}, a_1^{1} \in \mathcal{Q}'$, i.e. $a_1^{3} \neq \emptyset$, $a_1^{1} \neq \emptyset$. Hence a paradox (i.e. a contradiction) arises if one attempts to add to the theory the (perhaps plausible) axiom: $a_1^{3} = a_1^{1}$.

To prove $a_1^{3} \cap a_1^{1} = \emptyset$, we start with the relation

$$a_1^{3} \cap a_1^{1} = (a_1^{3+} \cap a_1^{1+}) \cup (a_1^{3+} \cap a_1^{1-}) \cup (a_1^{3-} \cap a_1^{1+}) \cup (a_1^{3-} \cap a_1^{1-})$$

Since $\mathcal{Q}$
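is a selection structure, $a_1^{3+} \cap a_1^{1+} \in \mathcal{Q}$ holds. If $a_1^{3+} \cap a_1^{1+} \in \mathcal{Q}'$ (i.e. $a_1^{3+} \cap a_1^{1+} \neq \emptyset$) were true, $\varphi(a_1^{3+} \cap a_1^{1+})$ would be an ensemble. The two demixtures

$$a_1^{3+} = (a_1^{3+} \cap a_1^{1+}) \cup \tilde a_1^{3+} \quad \text{and} \quad a_1^{1+} = (a_1^{3+} \cap a_1^{1+}) \cup \tilde a_1^{1+}$$

where

$$\tilde a_1^{3+} = a_1^{3+} \setminus (a_1^{3+} \cap a_1^{1+}), \qquad \tilde a_1^{1+} = a_1^{1+} \setminus (a_1^{3+} \cap a_1^{1+})$$

imply the demixtures

$$\varphi(a_1^{3+}) = \lambda\, \varphi(a_1^{3+} \cap a_1^{1+}) + (1 - \lambda)\, \varphi(\tilde a_1^{3+}), \qquad \varphi(a_1^{1+}) = \mu\, \varphi(a_1^{3+} \cap a_1^{1+}) + (1 - \mu)\, \varphi(\tilde a_1^{1+})$$

Since $\varphi(a_1^{3+}) = P_{u_-}$ and $\varphi(a_1^{1+}) = P_{v_-}$, the demixtures take on the form:

$$P_{u_-} = \lambda W' + (1 - \lambda) W'' \quad \text{and} \quad P_{v_-} = \mu W' + (1 - \mu) W'''$$

$a_1^{3+} \cap a_1^{1+} \neq \emptyset$ would imply $\lambda \neq 0$ and $\mu \neq 0$. Since $P_{u_-}$, $P_{v_-}$ are extreme points, $\lambda \neq 0$ and $\mu \neq 0$ would have the consequence $W' = P_{u_-}$ and $W' = P_{v_-}$, which is a contradiction. Thus $a_1^{3+} \cap a_1^{1+} = \emptyset$ is proved.

In a similar way, $a_1^{3+} \cap a_1^{1-} = \emptyset$, etc., can be proved. Thus $a_1^{3} \cap a_1^{1} = \emptyset$ is also proved.

It is a consequence of $a_1^{3} \cap a_1^{1} = \emptyset$ that the two demixtures

$$W = \tfrac{1}{2} P_{u_-} + \tfrac{1}{2} P_{u_+} = \tfrac{1}{2} P_{v_-} + \tfrac{1}{2} P_{v_+}$$

of $W = \tfrac{1}{2} \mathbf{1}$ are not coexistent.

The example $\varphi(a_1^{3}) = \varphi(a_1^{1})$ shows an essential feature of quantum mechanics, namely the fact that something of the structure of the preparing procedures gets lost by the mapping $\varphi$ of $\mathcal{Q}'$ onto $K$. The introduction of the structure $\mathcal{Q}$, $\mathcal{R}_0$, $\mathcal{R}$ as a basis for the interpretation of quantum mechanics is essential to clear up all problems of the interpretation.

The example above is an 'extreme' case of two non-coexistent preparing procedures. We define:

Definition 20: Two demixtures of an ensemble $w \in K$

$$w = \sum_i w_i = \sum_k \tilde w_k \qquad (w_i, \tilde w_k \in \check K)$$

are called complementary if the assumption $\varphi(a) = \varphi(\tilde a) = w$, together with two demixtures $a = \bigcup_i a_i$, $\tilde a = \bigcup_k \tilde a_k$ such that $\varphi_a(a_i) = w_i$, $\varphi_{\tilde a}(\tilde a_k) = \tilde w_k$, implies $a \cap \tilde a = \emptyset$.

Example: the two demixtures

$$W = \tfrac{1}{2} \mathbf{1} = \tfrac{1}{2} P_{u_-} + \tfrac{1}{2} P_{u_+} = \tfrac{1}{2} P_{v_-} + \tfrac{1}{2} P_{v_+}$$

are complementary.

Definition 20 may be generalized in such a form that only abstract Boolean algebras (instead of $\mathcal{Q}(a)$, $\mathcal{Q}(\tilde a)$) are used (see Ludwig, in preparation b).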
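The operator side of this example can be reproduced with a few lines of arithmetic. The following sketch is an illustration under stated assumptions: it uses the standard two-component representation of the spin eigenvectors (the 3-component eigenvectors as coordinate vectors, the 1-component eigenvectors as their normalised sum and difference), and it computes, for each response of the recording on particle 2, the unnormalised operator left for particle 1. The helper names are hypothetical. It only checks the two demixtures of the operator; the set-theoretic statement about the preparing procedures themselves is not something a matrix computation can show.

```python
import math

SQRT_HALF = 1 / math.sqrt(2)

# Total-spin-zero amplitudes CHI[i][j]: i labels particle 1, j labels particle 2,
# index 0 = u_plus, index 1 = u_minus (eigenvectors of the 3-component).
CHI = [[0.0, SQRT_HALF], [-SQRT_HALF, 0.0]]

U_PLUS, U_MINUS = (1 + 0j, 0j), (0j, 1 + 0j)
V_PLUS = (SQRT_HALF + 0j, SQRT_HALF + 0j)    # eigenvectors of the 1-component
V_MINUS = (SQRT_HALF + 0j, -SQRT_HALF + 0j)


def component(s):
    """Unnormalised operator for particle 1 when particle 2 responds on vector s."""
    alpha = [sum(s[j].conjugate() * CHI[i][j] for j in range(2)) for i in range(2)]
    return [[alpha[i] * alpha[k].conjugate() for k in range(2)] for i in range(2)]


def add(A, B):
    return [[A[i][k] + B[i][k] for k in range(2)] for i in range(2)]


def trace(A):
    return (A[0][0] + A[1][1]).real


def close(A, B, tol=1e-12):
    return all(abs(A[i][k] - B[i][k]) < tol for i in range(2) for k in range(2))


components_3 = [component(U_PLUS), component(U_MINUS)]   # 3-component recorded
components_1 = [component(V_PLUS), component(V_MINUS)]   # 1-component recorded
HALF_IDENTITY = [[0.5, 0.0], [0.0, 0.5]]

weights = [trace(c) for c in components_3 + components_1]
same_ensemble = close(add(*components_3), HALF_IDENTITY) and close(add(*components_1), HALF_IDENTITY)
different_components = not close(components_3[0], components_1[0])
print(weights, same_ensemble, different_components)
```

Each component carries weight one half, a response on the 'plus' vector of particle 2 leaves particle 1 in the 'minus' state, and both choices of recording add up to the same operator while decomposing it into different pure components.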
Another possibility of formulating the intuitive idea that $a_1^{3}$ and $a_1^{1}$ should not 'essentially' differ would be the following:

It is true that not every preparing procedure $a \in \mathcal{Q}'$ can be demixed 'arbitrarily' by constructing a corresponding apparatus. However, one might imagine that, ideally, there should exist many more and finer 'preparing procedures' than those of $\mathcal{Q}'$. For instance, it is conceivable that procedure $a$ could be demixed in such a way that the spin-3-components of the components of the demixture are well-determined. According to this point of view, we shall now attempt to add new axioms to the theory:

Let $\hat{\mathcal{Q}}$ be the set of all 'imagined preparing procedures'. We postulate $\hat{\mathcal{Q}} \supset \mathcal{Q}$ and all the axioms APS 1 through APS 7 also for the set $\hat{\mathcal{Q}}$. Naturally, it would make no sense to postulate the other axioms AV 1 through AV 4 also for $\hat{\mathcal{Q}}$, since we assume that there are possibly more imagined preparing procedures than those of $\mathcal{Q}$. In the same way as $\mathcal{Q}$, $\hat{\mathcal{Q}}$ also splits into equivalence classes. Since the effect procedures $f \in \mathcal{F}$ are 'tested' with the help of a larger set of preparing procedures, namely those of $\hat{\mathcal{Q}}$, the partition of $\mathcal{F}$ into equivalence classes with respect to $\hat{\mathcal{Q}}$ will, in general, be finer than that of $\mathcal{F}$ with respect to $\mathcal{Q}$.

According to the idea that the elements of $G$ are symbols for some properties which the microsystems are endowed with, we want to express that an $f \in \mathcal{F}$ such that $\psi(f) \in G$ characterizes only the fact that the microsystem has been found to 'have' the property $\psi(f)$. Therefore, we postulate the axiom:

$\mu(a, f_1) = 1$, $a \in \hat{\mathcal{Q}}'$ (not only $a \in \mathcal{Q}'$), and $\psi(f_1) = \psi(f_2) \in G$ implies $\mu(a, f_2) = 1$.

In this axiom, we express the idea that the difference between the two effect procedures $f_1$, $f_2$ is not essential if the microsystems recorded by these procedures 'have' the property $\psi(f_1) = \psi(f_2)$.
If an 'imagined preparing procedure' involves only systems endowed with the property $e = \psi(f_1) = \psi(f_2)$, then the probability must be one for $f_1$ as well as for $f_2$.

All axioms introduced for the set $\hat{\mathcal{Q}}$ are satisfied if one puts $\hat{\mathcal{Q}} = \mathcal{Q}$. They cannot lead to any contradiction.

The essential difference between $\hat{\mathcal{Q}}$ and $\mathcal{Q}$ shall be a much wider possibility to demix preparing procedures. According to the experiments of the EPR paradox, the following axiom seems to be very 'plausible'.

To every demixture of a recording method $b_0 \in \mathcal{R}_0$ in the form $b_0 = b_1 \cup b_2 \cup b_3$ (i.e. $b_i \in \mathcal{R}$ and $b_i \cap b_k = \emptyset$ if $i \neq k$), such that

$$\psi(b_0, b_1) = e_1 \in G, \qquad \psi(b_0, b_2) = e_2 \in G, \qquad \psi(b_0, b_3) = e_3 \in G$$

hold, there exists, to every $a \in \hat{\mathcal{Q}}'$ (not necessarily $a \in \mathcal{Q}'$), a demixture $a = \tilde a_1 \cup \tilde a_2 \cup \tilde a_3$ such that $\tilde a_i \in \hat{\mathcal{Q}}$ and $\mu(\tilde a_i, (b_0, b_i)) = 1$ (if $\tilde a_i \neq \emptyset$).

This 'demixing' axiom means the following: every set $a$ of microsystems prepared in a 'normal' preparing procedure $a$ may be thought to be demixed with respect to every triplet of decision effects $e_1, e_2, e_3$ satisfying $e_1 + e_2 + e_3 = 1$, in such a manner that the microsystems of the components $\tilde a_i$ 'have' the same property $e_i$.

The axioms introduced above allow us to define a mapping $\Phi$ of $G$ into $\mathcal{P}(M)$ by:

$$\Phi(e) = \bigcup_{a \in \hat{\mathcal{Q}}_1(e)} a$$

where $\hat{\mathcal{Q}}_1(e) = \{a \mid a \in \hat{\mathcal{Q}}' \text{ and } \mu(a, f) = 1 \text{ if } \psi(f) = e\}$.

It may be proved (see Ludwig, 1975):

(1). $e_1 \le e_2$ implies $\Phi(e_1) \subset \Phi(e_2)$;
(2). $\Phi(e) \cap \Phi(e^\perp) = \emptyset$;
(3). $e_i \in G$ and $e_1 + e_2 + e_3 = 1$ implies $\Phi(e_1) \cup \Phi(e_2) \cup \Phi(e_3) = M$.

These relations lead to a contradiction (as can be proved by the methods given in Bell, 1966; see also Ludwig, 1975). Therefore, the idea of a set $\hat{\mathcal{Q}}$ of 'imagined preparing procedures' such that the 'demixing axiom' holds is forbidden, even though this idea might seem very plausible.

9. CONCLUSIONS

We have presented some examples to demonstrate the following:

(1). How one may work with the theory founded in Sections 1 and 2.
(2).
How to make precise theoretical propositions concerning individual microsystems.
(3). How to formulate additional 'imagined' structures by axioms, with the intention of testing these additional structures for their compatibility with the theory.

We have shown that there is no difference between classical systems and microsystems as to the formulation of propositions concerning individual systems. The reason for the difference between classical systems and microsystems is only a different form of the statistical laws, due to axiom AV 4.

ACKNOWLEDGEMENT

Thanks are due to Professor Jenc, Dr. Kanthack, Professor Melsheimer and Professor Neumann for critical reading of the manuscript and numerous improvements in the text.

REFERENCES

Bell, J. S. (1966) 'On the problem of hidden variables in quantum mechanics', Rev. Mod. Phys., 38, 447-452.
Ludwig, G. (1964) 'Versuch einer axiomatischen Grundlegung der Quantenmechanik und allgemeinerer physikalischer Theorien', Z. Physik, 181, 233-260.
Ludwig, G. (1967a) 'An axiomatic foundation of quantum mechanics on a nonsubjective basis', in Quantum Theory and Reality, Springer, Berlin, pp. 98-104.
Ludwig, G. (1967b) 'Attempt of an axiomatic foundation of quantum mechanics and more general theories II', Commun. Math. Phys., 4, 331-348.
Ludwig, G. (1967c) 'Hauptsätze des Messens als Grundlage der Hilbertraumstruktur der Quantenmechanik', Z. Naturforsch., 22a, 1303-1323.
Ludwig, G. (1967d) 'Ein weiterer Hauptsatz des Messens als Grundlage der Hilbertraumstruktur der Quantenmechanik', Z. Naturforsch., 22a, 1324-1327.
Ludwig, G. (1968) 'Attempt of an axiomatic foundation of quantum mechanics and more general theories III', Commun. Math. Phys., 9, 1-12.
Ludwig, G. (1970) 'Deutung des Begriffs "Physikalische Theorie" und axiomatische Grundlegung der Hilbertraumstruktur der Quantenmechanik durch Hauptsätze des Messens', Lecture Notes in Physics, 4, Springer, Berlin.
Ludwig, G. (1971a) 'The measuring process and an axiomatic foundation of quantum mechanics', in Foundations of Quantum Mechanics, B. d'Espagnat ed., Academic Press, New York, pp. 287-315.
Ludwig, G. (1971b) 'A physical interpretation of an axiom within an axiomatic approach to quantum mechanics and a new formulation of this axiom as a general covering condition', Notes in Math. Phys., 1, Marburg.
Ludwig, G. (1972a) 'Meß- und Präparierprozesse', Notes in Math. Phys., 6, Marburg.
Ludwig, G. (1972b) 'An improved formulation of some theorems and axioms in the axiomatic foundation of the Hilbert space structure of quantum mechanics', Commun. Math. Phys., 26.
Ludwig, G. (1974a) 'Measuring and preparing processes', Lecture Notes in Physics, 29, Springer, Berlin, pp. 122-162.
Ludwig, G. (1974b) Einführung in die Grundlagen der Theoretischen Physik, Vol. 1, Bertelsmann, Düsseldorf.
Ludwig, G. (1975) Einführung in die Grundlagen der Theoretischen Physik, Vol. 3, Bertelsmann-Vieweg, Düsseldorf-Wiesbaden.
Ludwig, G. (in preparation a) Fundaments of Quantum Mechanics, Springer, Berlin, 2nd ed. of Die Grundlagen der Quantenmechanik (1954).
Ludwig, G. (in preparation b) 2nd ed. of Ludwig (1970).
Nagel, R. J. (1974) 'Order unit and base norm spaces', Lecture Notes in Physics, Vol. 29, Springer, Berlin, pp. 23-29.
Neumann, H. (1971) 'Classical systems and observables in quantum mechanics', Commun. Math. Phys., 23, 100-116.
Neumann, H. (1972) 'Classical systems in quantum mechanics and their representation in topological spaces', Notes in Math. Phys., 10, Marburg.
Neumann, H. (1974a) 'On the representation of classical systems', Lecture Notes in Physics, Vol. 29, Springer, Berlin, pp. 316-321.
Neumann, H. (1974b) 'A new physical characterization of classical systems in quantum mechanics', Int. J. Theoret. Phys., 9, 225-228.
v. Neumann, J. (1955) Mathematical Foundations of Quantum Mechanics, Princeton University Press.
Stolz, P.
(1971) 'Attempt of an axiomatic approach of quantum mechanics and more general theories IV', Commun. Math. Phys., 23, 117-126.

Quantum Mechanics of Bounded Operators

THALANAYAR S. SANTHANAM
Institute of Mathematical Sciences, Madras, India

1. INTRODUCTION

Heisenberg (1925) started the golden age of modern quantum mechanics. The essence of his discovery was the identification of physical observables with Hermitian matrices (operators), leading to the fact that operators corresponding to canonical variables do not commute. The 'uncertainty' in the simultaneous measurement of canonical variables is then a simple manifestation of this non-commuting behaviour of the corresponding Hermitian operators. Weyl (1931) rewrote the Heisenberg commutation relation in an exponential form leading to a certain nilpotent Lie group. Von Neumann (1931) and Stone (1930) simultaneously solved the problem of uniqueness for the Weyl commutation relations. The Weyl group has a single irreducible representation, which is precisely the Schrödinger representation. The complete equivalence of the Heisenberg and the Weyl commutation relations has been established (Rellich, 1946; Dixmier, 1958; Nelson, 1959; Cartier, 1966).

The canonical commutation relation (CCR) of Heisenberg implies that the Hilbert space on which these operators act is infinite dimensional and that one (or both) of these operators is necessarily unbounded. A natural question arises whether one can write an analogue of CCR for operators with a discrete bounded spectrum acting on a finite dimensional vector space. The answer is yes. To achieve this, we start with the Weyl form (Weyl, 1931) of the representations of the Abelian group of unitary rotations in ray space. We define the canonical generators (Hermitian) as the logarithms of the finite dimensional unitary rotations. We then compute explicitly the commutator of these generators. It is naturally trace free.
Also, we demonstrate that in the limit of continuous spectrum, valid as the dimension goes to infinity, the new commutator reduces to the standard CCR. Thus, our relation is the correct discrete analogue of CCR and it is unique if one starts with the Weyl form. We elevate the commutator for bounded operators with discrete spectrum to what we call quantum mechanics on discrete space (QMDS) (Santhanam and Tekumalla, 1976). We show that the angular momentum operators satisfy QMDS with the corresponding phases. To demonstrate this, we reformulate the theory of angular momentum by quantizing the azimuthal direction instead of the zenithal angle (Levy-Leblond, 1973).

It turns out that there is no 'uncertainty' in the measurement of canonical variables (canonical in the sense of QMDS). The uncertainty in the measurement of the usual canonical variables (CCR) is, in our formulation, a manifestation of the continuous nature of the spectrum. QMDS is, however, distinct from the classical theory since the operators do not commute.

There is an approach due to Schwinger (1960) in which one discusses quantum mechanics in finite dimensions and eventually takes suitable limits to get the usual theory. We will make use of this technique. Besides, the representation theory of generalized Clifford algebras has been studied by Ramakrishnan and co-workers (1969a, b). We shall make use of some of their results.

2. WEYL'S FORM OF HEISENBERG'S RELATION

Suppose $A$ and $B$ are two elements of the Abelian group of unitary rotations in ray space so that

$$AB = \varepsilon BA \tag{1}$$

where $\varepsilon$ is the primitive $n$th root of unity. By iteration we have

$$A^k B^l = \varepsilon^{kl} B^l A^k, \qquad k, l = 0, 1, \ldots, n-1 \tag{2}$$

From this equation it follows that $A^n$ commutes with $B$ and $B^n$ commutes with $A$, and if the Abelian rotation group is irreducible it follows from Schur's lemma that

$$A^n = I, \qquad B^n = I \tag{3}$$

where $I$ is the $(n \times n)$ unit matrix.
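As a quick numerical illustration of relations (1) to (3), here is a minimal sketch in plain Python. It assumes one particular concrete choice for the pair: a cyclic shift matrix for $A$ and a diagonal matrix of roots of unity for $B$. The function names (`shift_and_clock`, `weyl_relations_hold`) are hypothetical helpers introduced only for this example.

```python
import cmath


def shift_and_clock(n):
    """Return (A, B, eps): a cyclic shift A, a diagonal B and eps = exp(2*pi*i/n).

    Assumption for illustration: A has a single 1 in row r at column (r+1) mod n,
    and B = diag(1, eps, ..., eps^(n-1)).
    """
    eps = cmath.exp(2j * cmath.pi / n)
    A = [[1 + 0j if c == (r + 1) % n else 0j for c in range(n)] for r in range(n)]
    B = [[eps ** r if r == c else 0j for c in range(n)] for r in range(n)]
    return A, B, eps


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matpow(X, p):
    """Naive matrix power by repeated multiplication (fine for small n)."""
    n = len(X)
    R = [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    for _ in range(p):
        R = matmul(R, X)
    return R


def scale(c, X):
    return [[c * x for x in row] for row in X]


def close(X, Y, tol=1e-9):
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


def weyl_relations_hold(n):
    """Check equations (1), (2) and (3) for the explicit pair above."""
    A, B, eps = shift_and_clock(n)
    identity = matpow(A, 0)
    ok = close(matmul(A, B), scale(eps, matmul(B, A)))              # eq. (1)
    for k in range(n):
        for l in range(n):
            lhs = matmul(matpow(A, k), matpow(B, l))
            rhs = scale(eps ** (k * l), matmul(matpow(B, l), matpow(A, k)))
            ok = ok and close(lhs, rhs)                             # eq. (2)
    ok = ok and close(matpow(A, n), identity)                       # eq. (3)
    ok = ok and close(matpow(B, n), identity)
    return ok
```

Calling `weyl_relations_hold(5)` exercises all $n^2$ index pairs of equation (2); the check is numerical (floating point with a small tolerance), not a proof.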
The order of any element of an irreducible Abelian rotation group in $n$ dimensions is consequently a factor of $n$. In the diagonal representation for $B$, i.e.

$$B = \mathrm{diag}(1, \varepsilon, \varepsilon^2, \ldots, \varepsilon^{n-1}) \tag{4}$$

the matrix $A$ has the form of a cyclic permutation matrix

$$A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix} \tag{5}$$

The action of $A$ and $B$ on the components of an $n$-dimensional vector is then

$$A:\ x'_k = x_{k+1}, \qquad B:\ x'_k = \varepsilon^k x_k \tag{6}$$

More generally,

$$A^s:\ x'_k = x_{k+s}, \qquad B^t:\ x'_k = \varepsilon^{kt} x_k, \qquad s, t = \text{any integer} \tag{7}$$

The transition to continuous groups is now carried out. Following Weyl (1931) we set

$$A = e^{i\xi P}, \qquad B = e^{i\eta Q} \tag{8}$$

where $\xi$ and $\eta$ are real infinitesimal parameters, and we pass to the limit $n \to \infty$. $\varepsilon$ has therefore to be identified as

$$\varepsilon = e^{2\pi i/n} = e^{i\xi\eta} \tag{9}$$

which yields

$$n\xi\eta = 2\pi \tag{10}$$

We see that

$$B^t = e^{i\eta t Q} = e^{i\tau Q} \tag{11}$$

$$\tau = \eta t \tag{12}$$

Since

$$\varepsilon^{kt} = e^{i\xi\eta kt} = e^{i\xi k \tau} \tag{13}$$

the eigenvalues of $Q$ are given by $q = \xi k \bmod n\xi$, where $k$ runs through all integral values. As $n\xi = 2\pi/\eta$, by choosing $\eta$ infinitesimal we see that $q$ may assume all real numbers from $-\infty$ to $+\infty$. In identifying

$$x_k = \sqrt{\xi}\,\psi(\xi k), \qquad \xi k = q \tag{14}$$

where $\psi(q)$ is an arbitrary function satisfying the normalization

$$\int |\psi(q)|^2\, dq = 1 \tag{15}$$

we find that the quantity $e^{i\tau Q}$ is represented by the linear operator

$$\psi(q) \to e^{i\tau q}\,\psi(q) \tag{16}$$

and the operator representing $e^{i\sigma P}$ by

$$\psi(q) \to \psi(q + \sigma) \tag{17}$$

From equation (17) it is clear that $e^{i\sigma P}$ acts as the translation operator. If these linear operators are infinitesimal we have

$$q:\ \delta\psi(q) = q\,\psi(q), \qquad p:\ \delta\psi(q) = \frac{1}{i}\frac{d\psi}{dq} \tag{18}$$

Thus, it follows that the Schrödinger representation (wave equation) is a necessary consequence of Weyl's commutation relation. To summarize, we have the theorem (of von Neumann) as follows.

Theorem. Let $U(\xi) = e^{i\xi P}$ and $v(\eta) = e^{i\eta Q}$ be one-parameter continuous unitary groups on a separable Hilbert space $\mathcal{H}$ satisfying the Weyl relation

$$U(\xi)\,v(\eta) = e^{i\xi\eta}\, v(\eta)\,U(\xi) \tag{19}$$

Then there are closed subspaces $\mathcal{H}_l$ so that (1).
$\mathcal{H} = \bigoplus_{l=1}^{N} \mathcal{H}_l$ ($N$ is a positive integer or $\infty$). (2). $U(\xi): \mathcal{H}_l \to \mathcal{H}_l$ and $v(\eta): \mathcal{H}_l \to \mathcal{H}_l$ for all $\xi, \eta \in \mathbb{R}$. (3). For each $l$, there is a unitary operator $T_l: \mathcal{H}_l \to L^2(\mathbb{R})$ such that $T_l U(\xi) T_l^{-1}$ is translation to the left by $\xi$ and $T_l v(\eta) T_l^{-1}$ is multiplication by $e^{i\eta x}$.

It also follows from the theorem that if $P$ and $Q$ denote the generators of $U(\xi)$ and $v(\eta)$, respectively, then there is a dense domain $D \subset \mathcal{H}$ so that (a) $P: D \to D$, $Q: D \to D$; (b) $[Q, P]\phi = i\phi$ for all $\phi \in D$; and (c) $P$ and $Q$ are essentially self-adjoint on $D$. Thus the Schrödinger representation is the only representation of CCR.

It is not difficult to see that a pair $P$, $Q$ of self-adjoint operators satisfying the canonical commutation relation $[Q, P] = iI$ cannot both be bounded. If they were bounded then

$$PQ^n - Q^n P = -in\,Q^{n-1}$$

and thus

$$n\,\|Q\|^{n-1} \le 2\,\|P\|\,\|Q\|^{n}$$

and hence

$$2\,\|P\|\,\|Q\| \ge n$$

for all $n$, which is a contradiction. Therefore, either $P$ or $Q$ or both must be unbounded. In finite dimensions the impossibility of CCR can also be seen by simply taking the trace on both sides of CCR.

Mackey (1949) replaced $v(\eta) = e^{i\eta Q}$ by its spectral measure $E$ such that

$$v(\eta) = \int e^{i\eta x}\, dE(x)$$

The measure $E$ and the group $U$ form an imprimitivity system for $\mathbb{R}$ based on $\mathbb{R}$, and the uniqueness theorem is then a consequence of the imprimitivity theorem.

3. A BASIS IN FINITE DIMENSION

In this section we essentially follow the method of Schwinger (1960) in the construction of a unitary operator $A$ given by the cyclic permutation matrix. The action of $A$ in Dirac's notation is

$$\langle a^k | A = \langle a^{k+1} |, \qquad k = 1, 2, \ldots, n \tag{20}$$

where we identify

$$\langle a^{n+1} | = \langle a^{1} | \tag{21}$$

since

$$A^n = I \tag{22}$$

The eigenvalues of $A$ are then the $n$ roots of unity

$$\varepsilon^k, \qquad k = 0, 1, 2, \ldots, n-1, \qquad \varepsilon = e^{2\pi i/n} \tag{23}$$

The Sylvester matrix defined as

$$S = \frac{1}{\sqrt{n}} \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & \varepsilon & \cdots & \varepsilon^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \varepsilon^{n-1} & \cdots & \varepsilon^{(n-1)^2} \end{pmatrix}, \qquad SS^{\dagger} = S^{\dagger}S = I \tag{24}$$

diagonalizes any circulant matrix and in particular the cyclic permutation matrix $A$.
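The diagonalizing property of the Sylvester matrix can be checked numerically. The following is a minimal sketch in plain Python, assuming the entry convention $S_{rc} = \varepsilon^{rc}/\sqrt{n}$ with $\varepsilon = e^{2\pi i/n}$ and the cyclic permutation matrix with a single 1 in row $r$ at column $(r+1) \bmod n$; the helper names are hypothetical.

```python
import cmath
import math


def sylvester(n):
    """Sylvester (discrete Fourier) matrix with entries eps^(r*c)/sqrt(n)."""
    return [[cmath.exp(2j * math.pi * ((r * c) % n) / n) / math.sqrt(n)
             for c in range(n)] for r in range(n)]


def cyclic_shift(n):
    """Cyclic permutation matrix: row r has its single 1 in column (r+1) mod n."""
    return [[1 + 0j if c == (r + 1) % n else 0j for c in range(n)] for r in range(n)]


def dagger(X):
    """Conjugate transpose."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def diagonalized_shift(n):
    """Return S^dagger A S, which should be diag(1, eps, ..., eps^(n-1))."""
    S = sylvester(n)
    return matmul(matmul(dagger(S), cyclic_shift(n)), S)
```

For small $n$ one can inspect `diagonalized_shift(n)` directly: under the stated conventions the off-diagonal entries vanish to rounding error and the diagonal carries the $n$ roots of unity in order.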
Hence

$$S^{-1} A S = B \tag{25}$$

where

$$B = \mathrm{diag}(1, \varepsilon, \varepsilon^2, \ldots, \varepsilon^{n-1}) \tag{26}$$

Thus any two operators connected by the transform equation (25) have as their eigenvalues the $n$ roots of unity, and if we denote their eigenvectors (of $A$ and $B$) by $|\varepsilon^k\rangle$ and $|\varepsilon^l\rangle$, respectively, then

$$\langle \varepsilon^l | \varepsilon^k \rangle = S_{kl} = \frac{1}{\sqrt{n}}\, e^{(2\pi i/n)\,kl} \tag{27}$$

In fact, we have the theorem of Schwinger.

Theorem. The basis of any finite dimensional vector space can be mapped by a unitary transformation (Sylvester transform) to a basis furnished by the roots of unity.

Suppose we have two unitary operators $U$ and $V$ satisfying

$$VU = \varepsilon UV, \qquad \varepsilon^n = 1, \qquad U^n = V^n = I \tag{28}$$

Then the $n^2$ operators defined by

$$X_{kl} = \frac{1}{\sqrt{n}}\, U^k V^l, \qquad k, l = 0, 1, 2, \ldots, n-1 \tag{29}$$

are linearly independent. All $X_{kl}$ except the unit operator are traceless and, with the multiplication defined by equation (28), form an associative algebra $C^n_2$. We shall later discuss the representations of the generalized Clifford algebra $C^n_m$. Suffice it now to say that $C^n_2$ is isomorphic to the matrix ring $M_{n \times n}$. Thus, the operators defined in equation (29) furnish a unitary operator basis, and this fact has been particularly used by Schwinger (1960). Now consider an arbitrary $Y$. We notice that

$$\sum_{k,l} X_{kl}\, Y X_{kl}^{\dagger} = \frac{1}{n} \sum_{k,l} U^k V^l\, Y\, V^{-l} U^{-k} = \frac{1}{n} \sum_{k,l} V^k U^l\, Y\, U^{-l} V^{-k} \tag{30}$$

It is easy to show that this operator commutes with $U$ and $V$. Hence

$$\sum_{k,l=0}^{n-1} X_{kl}\, Y X_{kl}^{\dagger} = \tau I, \qquad \tau = \mathrm{Tr}\, Y \tag{31}$$

We refer to $U$ and $V$ as a complementary pair of operators. From equation (28) it follows that

$$U^k V^l\, U\, V^{-l} U^{-k} = \varepsilon^{l}\, U, \qquad U^k V^l\, V\, V^{-l} U^{-k} = \varepsilon^{-k}\, V \tag{32}$$

which exhibits the unitary transformations that produce only cyclic spectral translations. If $Y$ is an arbitrary function $F$ of $U$ and $V$ we see from equation (31) that

$$\frac{1}{n^2} \sum_{k,l} F(\varepsilon^{l} U, \varepsilon^{-k} V) = \frac{1}{n}\, \mathrm{Tr}\, F \tag{33}$$

which is a kind of ergodic theorem. The operators $V^l$ and $U^k$, $l, k = 0, 1, 2, \ldots, n-1$, will satisfy the same operator relation as $V$ and $U$, viz.
$$V^l U^k = e^{2\pi i/n}\, U^k V^l \tag{34}$$

provided

$$kl = 1 \bmod n \tag{35}$$

with the unique solution given by the Fermat-Euler theorem

$$l = k^{\phi(n)-1} \bmod n \tag{36}$$

where $\phi(n)$ is the number of integers less than and relatively prime to $n$. The pair of operators $U^k$, $V^l$ are also complementary. Suppose now we write

$$n = n_1 n_2 \tag{37}$$

where the integers $n_1$ and $n_2$ are relatively prime. Then we can rewrite

$$e^{(2\pi i/n)\,k} = e^{(2\pi i/n_1)\,k_1}\, e^{(2\pi i/n_2)\,k_2} \tag{38}$$

with

$$k = k_1 n_2 + k_2 n_1, \qquad k_1 = 0, 1, 2, \ldots, n_1 - 1, \qquad k_2 = 0, 1, 2, \ldots, n_2 - 1 \tag{39}$$

Thus a single basis defined by $\varepsilon^k$ can be written as

$$|\varepsilon^k\rangle = |\varepsilon_1^{k_1}\rangle\, |\varepsilon_2^{k_2}\rangle \tag{40}$$

where $\varepsilon_1$ and $\varepsilon_2$ are of periods $n_1$ and $n_2$, respectively. Then the single pair of complementary operators can be replaced by two pairs satisfying

$$V_1 U_1 = \varepsilon_1 U_1 V_1 \tag{41}$$

and

$$V_2 U_2 = \varepsilon_2 U_2 V_2 \tag{42}$$

where

$$U_1 = U^{n_2}, \quad V_1 = V^{l_1 n_2}, \qquad U_2 = U^{n_1}, \quad V_2 = V^{l_2 n_1} \tag{43}$$

with

$$l_1 = n_2^{\phi(n_1)-1} \bmod n_1, \qquad l_2 = n_1^{\phi(n_2)-1} \bmod n_2 \tag{44}$$

The two pairs commute with each other. The basis now becomes

$$X_{kl} \to X_{k_1 l_1, k_2 l_2} = \frac{1}{\sqrt{n}}\, U_1^{k_1} U_2^{k_2} V_1^{l_1} V_2^{l_2} \tag{45}$$

where

$$k_1, l_1 = 0, 1, 2, \ldots, n_1 - 1, \qquad k_2, l_2 = 0, 1, 2, \ldots, n_2 - 1 \tag{46}$$

Or

$$X_{kl} = \prod_{j=1}^{2} X_{k_j l_j}, \qquad X_{k_j l_j} = \frac{1}{\sqrt{n_j}}\, U_j^{k_j} V_j^{l_j}, \qquad k_j, l_j = 0, 1, 2, \ldots, n_j - 1 \tag{47}$$

In general, since any integer can be written in terms of primes

$$n = \prod_{j=1}^{f} \nu_j \tag{48}$$

where $f$ is the total number of primes including repetitions, the resulting factored basis is then

$$X(kl) = \prod_{j=1}^{f} X(k_j l_j) \tag{49}$$

where

$$X(k_j l_j) = \frac{1}{\sqrt{\nu_j}}\, U_j^{k_j} V_j^{l_j}, \qquad k_j, l_j = 0, 1, 2, \ldots, \nu_j - 1 \tag{50}$$

In the particular case of $\nu = 2$, the complementary pair of operators anticommute and the basis forms the Clifford algebra. In the next section we shall study the structure of generalized Clifford algebras.

4. GENERALIZED CLIFFORD ALGEBRAS

The problem that Dirac (1926) faced was to linearize

$$x_1^2 + x_2^2 + \cdots + x_m^2 = \Big( \sum_{i=1}^{m} \alpha_i x_i \Big)^2 \tag{51}$$

and consequently the $\alpha$'s satisfy

$$\alpha_i \alpha_j = -\alpha_j \alpha_i, \quad i \ne j, \qquad \alpha_i^2 = I, \qquad i, j = 1, \ldots, m \tag{52}$$

The set of elements defined by

$$\alpha = \prod_{i=1}^{m} \alpha_i^{\lambda_i}, \qquad \lambda_i = 0, 1 \tag{53}$$

which are $2^m$ in number, are linearly independent and, with the product defined by equation (52), form an associative algebra which is the familiar Clifford algebra $C^2_m$. In this case, it is well known (Boerner, 1963) that when $m = 2\nu$ is even, there is a single irreducible representation of dimension $2^\nu$ and $C^2_{2\nu} \cong M_{2^\nu \times 2^\nu}$. When $m = 2\nu + 1$, $C^2_{2\nu+1} = C^2_{2\nu} + C^2_{2\nu}$, where the elements of the second are simply the negatives of the first. The elements defined by

$$\alpha_{ij} = [\alpha_i, \alpha_j] \tag{54}$$

satisfy the algebra $O(m+1)$ of the orthogonal group in $(m+1)$ dimensions. In fact, they furnish the spinor representation of $O(m+1)$. A natural question arises whether one can solve the linearization

$$x_1^n + x_2^n + \cdots + x_m^n = \Big( \sum_{i=1}^{m} e_i x_i \Big)^n \tag{55}$$

The answer is yes if the $e$'s satisfy the ordered commutation relation

$$e_i e_j = \varepsilon\, e_j e_i, \quad i < j, \qquad e_i^n = I, \qquad i, j = 1, 2, \ldots, m, \qquad \varepsilon = e^{2\pi i/n} \tag{56}$$

The basis defined by

$$\prod_{i=1}^{m} e_i^{\tau_i}, \qquad \tau_i = 0, 1, 2, \ldots, n-1 \tag{57}$$

which are $n^m$ in number, are linearly independent and, with the product given by equation (56), form an associative algebra called the generalized Clifford algebra $C^n_m$. Morinaga and Nono (1952), Yamazaki (1964) and Morris (1967) have studied the algebra in detail, and Ramakrishnan and co-workers (1969b) have studied exhaustively the particular realizations and their connections with physical problems. It is sufficient for our purpose to state the following theorem.

Theorem. When $m = 2\nu$, the algebra $C^n_m$ has one irreducible representation of dimension $n^\nu$ and $C^n_{2\nu} \cong M_{n^\nu \times n^\nu}$, the matrix ring of dimension $n^\nu$. When $m = 2\nu + 1$, $C^n_{2\nu+1} = C^n_{2\nu} + C^n_{2\nu} + \cdots + C^n_{2\nu}$ ($n$ copies). The elements of the second, third, etc., are obtained from the first by multiplication with $\varepsilon^i$, $i = 1, 2, \ldots, n-1$.
The explicit realizations can be obtained either by a straightforward extension of the method of Brauer and Weyl (1935), or by using the method of Ramakrishnan (1967), or by an extension of Rasevskii's method (Ramakrishnan and co-workers, 1969b). It should have become obvious by now that the algebra satisfied by the canonical pair of unitary operators $U$ and $V$ of the last section is simply $C^n_2 \cong M_{n \times n}$. Since in the basis $X_{kl}$ all except the unit element are traceless, suitable linear combinations furnish a Hermitian basis and thus give the self representation of the group su($n$) (Ramakrishnan and co-workers, 1969a).

5. COMMUTATOR IN FINITE DIMENSIONS

We now start with the Weyl algebra. We have seen that in finite dimensions it has a single irreducible representation (which in the limit of infinite dimensions is, in fact, the Schrödinger representation). We define the generators as the 'formal' logarithms of the Weyl operators and we solve for them. Then we compute their commutator. We then show that in the limit of infinite dimensions it reduces to the one of Heisenberg. We start with the operators $V(\xi)$ and $U(\eta)$ satisfying the Weyl algebra

$$VU = \varepsilon UV, \qquad \varepsilon = \exp\left(\frac{2\pi i}{n}\right), \qquad V^n = U^n = I \tag{58}$$

We have seen that the single irreducible representation of equation (58) in finite dimensions is given by

$$U = \mathrm{diag}(1, \varepsilon, \varepsilon^2, \ldots, \varepsilon^{n-1}) \tag{59}$$

with $V$ the corresponding cyclic permutation matrix. We define the Hermitian operators $P$ and $Q$ by

$$V = e^{i\xi P}, \qquad U = e^{i\eta Q} \tag{60}$$

where we have chosen $\xi = \eta = \sqrt{2\pi/n}$. Then, formally

$$Q = -i\sqrt{\frac{n}{2\pi}}\, \log U \tag{61}$$

The logarithms of $U$ and $V$ are well defined since they are non-singular (Gantmacher, 1959).
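The construction of $Q$ and $P$ as formal logarithms can be tried out numerically. The sketch below, in plain Python, makes three assumptions that are ours rather than the text's at this point: the principal value $\log\varepsilon = 2\pi i/n$ is used, so that $\log U = (\log\varepsilon)\,\mathrm{diag}(0, 1, \ldots, n-1)$; the Sylvester matrix has entries $\varepsilon^{rc}/\sqrt{n}$; and $\log V$ is taken as $S(\log U)S^{\dagger}$. The function name `qp_commutator` is hypothetical.

```python
import cmath
import math


def qp_commutator(n):
    """Return (K, P) with K = [Q, P] for the finite-dimensional Weyl pair.

    Q = -i*sqrt(n/2pi) * log U is diagonal; P = S Q S^dagger.
    """
    log_eps = 2j * math.pi / n                    # principal value (assumption)
    pref = -1j * math.sqrt(n / (2 * math.pi))
    # Diagonal entries of Q; these come out real: sqrt(2*pi/n) * k.
    q = [pref * log_eps * k for k in range(n)]
    S = [[cmath.exp(log_eps * ((r * c) % n)) / math.sqrt(n) for c in range(n)]
         for r in range(n)]
    # P = S Q S^dagger, written out entry by entry.
    P = [[sum(S[r][u] * q[u] * S[s][u].conjugate() for u in range(n))
          for s in range(n)] for r in range(n)]
    # Q is diagonal, so (QP - PQ)_{rs} = (q_r - q_s) * P_{rs}.
    K = [[(q[r] - q[s]) * P[r][s] for s in range(n)] for r in range(n)]
    return K, P
```

With these conventions the commutator has vanishing diagonal (hence zero trace), and for $r \ne s$ the geometric-sum identity $\sum_{u=0}^{n-1} u\,x^u = n/(x-1)$ (valid for $x^n = 1$, $x \ne 1$) gives the closed form $(2\pi/n)(r-s)/(\varepsilon^{r-s}-1)$, which the numerical entries can be compared against.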
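Further, since

$$S^{-1} V S = U \tag{62}$$

where $S$ is the Sylvester matrix

$$S = \frac{1}{\sqrt{n}} \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & \varepsilon & \cdots & \varepsilon^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & \varepsilon^{n-1} & \cdots & \varepsilon^{(n-1)^2} \end{pmatrix}, \qquad S^{-1} = S^{\dagger} \tag{63}$$

from the definition of the logarithm of a matrix, in view of equation (62), we have

$$\log V = S\, (\log U)\, S^{-1} \tag{64}$$

Therefore,

$$Q = -i\sqrt{\frac{n}{2\pi}}\, \log U, \qquad P = -i\sqrt{\frac{n}{2\pi}}\, S\, (\log U)\, S^{-1} \tag{65}$$

where

$$\log U = (\log \varepsilon)\, \mathrm{diag}(0, 1, 2, \ldots, n-1) \tag{66}$$

The commutator of $Q$ and $P$ is then

$$K_{rs} = [Q, P]_{rs} = -\frac{n}{2\pi}\, [\log U,\ S (\log U) S^{-1}]_{rs} \tag{67}$$

where the matrix indices are labelled from $(0, 1, 2, \ldots, n-1)$. Explicitly evaluating equation (67), we have

$$[Q, P]_{rs} = -\frac{(\log \varepsilon)^2}{2\pi}\, (r - s) \sum_{u=0}^{n-1} u\, \varepsilon^{(r-s)u} \tag{68}$$

If $\varepsilon^{r-s} = x = 1$, then

$$\sum_{u=0}^{n-1} u\, x^u = \frac{n(n-1)}{2} \tag{69}$$

If $x \ne 1$, since $x^n = 1$, there results

$$\sum_{u=0}^{n-1} u\, x^u = \frac{n}{x - 1} \tag{70}$$

since

$$\sum_{u=0}^{n-1} x^u = 0 \tag{71}$$

Thus, we find

$$[Q, P]_{rs} = -\frac{(\log \varepsilon)^2}{2\pi}\, (r - s) \times \begin{cases} \dfrac{n(n-1)}{2} & \text{if } \varepsilon^{r-s} = 1 \\[2mm] \dfrac{n}{\varepsilon^{r-s} - 1} & \text{if } \varepsilon^{r-s} \ne 1 \end{cases} \tag{72}$$

We notice that this commutator is off-diagonal and hence trace-free, as it should be for bounded operators. Alternatively, we can directly sum the expression in equation (68), which yields

$$[Q, P]_{rs} = \begin{cases} 0, & r = s \\[1mm] -\dfrac{(\log \varepsilon)^2}{2\pi}\, (r - s)\, n \left[ -\dfrac{1}{2} + \dfrac{i}{2} \cot \dfrac{\pi}{n}(s - r) \right], & r \ne s \end{cases} \tag{73}$$

We call the commutation relations (72) or (73) quantum mechanics in discrete space (QMDS).

Let us now evaluate the following commutator, which we shall use when we discuss the application of QMDS to the algebra of angular momentum operators later. Let

$$\Omega = [Q, V] \tag{74}$$

From equations (65) and (66) it is clear that

$$\Omega = -i\sqrt{\frac{n}{2\pi}}\, \log \varepsilon\ [N,\ S U S^{-1}] \tag{75}$$

where the matrix

$$N = \mathrm{diag}\{0, 1, 2, \ldots, (n-1)\} \tag{76}$$

Explicit evaluation of equation (75) yields

$$\Omega = i\sqrt{\frac{n}{2\pi}}\, (\log \varepsilon)\, K \tag{77}$$

where the matrix

$$K = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -(n-1) & 0 & 0 & \cdots & 0 \end{pmatrix} = V - L \tag{78}$$

where

$$L = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \\ n & 0 & \cdots & 0 \end{pmatrix} \tag{79}$$

6. LIMITING CASE

We shall now show that the commutator of the bounded operators given by equation (72) does reduce to the usual form of Heisenberg in the limit of continuous spectrum, valid as $n \to \infty$. Beginning with equation (68), we relabel the rows and columns from $-(n-1)/2$ to $(n-1)/2$ instead of from $0$ to $n-1$ and replace the sum by the integral.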
In other words, we let the matrix indices take continuous values and take the limit $n \to \infty$. This is the method of Heisenberg (1931) and Dirac (1930) to pass from the discrete to the continuous case. Then we have

$$[Q, P]_{rs} = -\frac{(\log \varepsilon)^2}{2\pi}\, (r - s) \int_{-\infty}^{\infty} u\, \exp\{2\pi i u (r - s)/n\}\, du$$

$$= -i\,(r - s)\, \frac{d}{d(r-s)} \int_{-\infty}^{\infty} \exp\{2\pi i u (r - s)/n\}\, d\!\left(\frac{u}{n}\right)$$

$$= -i\,(r - s)\, \delta'(r - s) = i\,\delta(r - s)$$

where $\delta$, $\delta'$ are the Dirac delta function and its derivative, respectively. We have used the fact that in the limit considered above $\log \varepsilon = 2\pi i/n$, i.e. we have retained only its principal part. This is exactly what Weyl does, by choosing the parameter $\eta$ infinitesimal. The same limiting method can be demonstrated by starting from equation (73). Thus, QMDS reduces to the usual theory in the limit mentioned. Since the representation of the Weyl group is unique, it is clear that QMDS is unique too. Since the commutator (QMDS) is off-diagonal, the diagonal measurements 'commute' and hence there is no 'uncertainty'. However, the commutator is not zero as in the classical case.

7. APPLICATION TO ANGULAR MOMENTUM

In this section, we apply the concept of QMDS studied in detail in the previous sections to the study of the angular momentum operators, which provide an excellent example of bounded operators with a discrete spectrum. We shall reformulate the angular momentum algebra by quantizing the azimuthal direction instead of the zenithal angle (Levy-Leblond, 1973). We also briefly remark on the related problem of defining a phase canonically conjugate to the number operator, which has a lower bound.

Denoting the generators of rotation by $J_x$, $J_y$ and $J_z$, we know that they satisfy the commutation relation

$$[J_i, J_j] = i\,\epsilon_{ijk}\, J_k, \qquad i, j, k \text{ cyclic} \tag{80}$$

or in the Cartan canonical form (choosing $J_z$ diagonal)

$$[J_z, J_\pm] = \pm J_\pm, \qquad [J_+, J_-] = 2 J_z \tag{81}$$

where

$$J_\pm = J_x \pm i J_y \tag{82}$$

What is usually done is to choose a basis diagonal in $J_z$ and $J^2 = J_x^2 + J_y^2 + J_z^2$, i.e.
$$J_z \psi_{jm} = m\, \psi_{jm}, \qquad J^2 \psi_{jm} = j(j+1)\, \psi_{jm}, \qquad -j \le m \le j \tag{83}$$

By polar decomposing $J_+$ we have

$$J_+ = J_T\, Y \tag{84}$$

with $J_T$ Hermitian and $Y$ unitary. Taking the adjoint yields

$$J_- = (J_+)^{\dagger} = Y^{-1} J_T \tag{85}$$

It follows that

$$J_+ J_- = J_T^2 = J^2 - J_z^2 + J_z \tag{86}$$

We define the transverse component $J_\perp$ as

$$J_+ = Y J_\perp \tag{87}$$

and hence

$$J_\perp^2 = J_+^{\dagger} J_+ = J^2 - J_z^2 - J_z \tag{88}$$

In the $(J^2, J_z)$ diagonal basis given by equation (83) we find from equations (86) and (88) that (choosing a phase convention)

$$\langle jm | J_T | jn \rangle = \delta_{mn}\, [(j + m)(j - m + 1)]^{1/2} \tag{89}$$

$$\langle jm | J_\perp | jn \rangle = \delta_{mn}\, [(j - m)(j + m + 1)]^{1/2} \tag{90}$$

It follows from equations (84) and (89) and equations (87) and (90) that in this basis the operator $Y$ is just the cyclic permutation matrix $A$ defined in the last section. In fact,

$$Y = A \tag{91}$$

and

$$N = -J_z + jI \tag{92}$$

where $I$ is the unit matrix. From equation (77) it can be seen that

$$[J_z, A] - A = -L \tag{93}$$

which has been derived by Levy-Leblond. It is also clear from equation (72) that

$$[J_z, \phi]_{rs} = \begin{cases} i\,(\log \varepsilon)\, \dfrac{r - s}{\varepsilon^{r-s} - 1} & \text{for } \varepsilon^{r-s} \ne 1 \\[2mm] 0 & \text{for } \varepsilon^{r-s} = 1 \end{cases} \tag{94}$$

Since $Y\,(= A)$ acts as a cyclic permutation matrix for the basis in which $J_z$ is diagonal, it follows (Schwinger) that the $(J^2, Y)$ diagonal basis given by

$$J^2 |j, \mu\rangle = j(j+1)\, |j, \mu\rangle, \qquad Y |j, \mu\rangle = \mu\, |j, \mu\rangle, \qquad \mu = \varepsilon^{\zeta}, \quad \zeta = -j, \ldots, +j \tag{95}$$

is connected by the Sylvester transform to the $(J^2, J_z)$ basis of equation (83), i.e.

$$|j, \zeta\rangle = \sum_{m} S_{\zeta m}\, |j, m\rangle = \frac{1}{(2j+1)^{1/2}} \sum_{m} \varepsilon^{m\zeta}\, |j, m\rangle \tag{96}$$

The matrix elements of $J_z$ in the new basis are given by

$$\langle j, \zeta | J_z | j, \zeta' \rangle = \frac{1}{2j+1} \sum_{m=-j}^{j} m\, \varepsilon^{m(\zeta' - \zeta)} \tag{97}$$

Of course, one knows that in a finite dimensional space we can go from one basis $(jm)$ to another basis $(j\zeta)$ by a unitary transformation $(S)$. But what is important is the fact that the operator $Y$ (not $J_\pm$) acts as a cyclic permutation matrix in the $J_z$ diagonal basis, and that the commutation relation of $Y$ with $J_z$ is furnished by equation (93), which carries the essence of our QMDS.
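Relation (93) lends itself to an exact check for small spins. The sketch below, in plain Python with rational arithmetic, assumes that the basis is ordered so that $N = -J_z + jI = \mathrm{diag}(0, 1, \ldots, 2j)$, that $A$ is the cyclic permutation matrix with a single 1 in row $r$ at column $(r+1) \bmod n$, and that $L$ has $n = 2j+1$ as its only non-zero entry, in the bottom-left corner. These conventions, and the helper name `levy_leblond_holds`, are assumptions made for this illustration.

```python
from fractions import Fraction


def levy_leblond_holds(two_j):
    """Check [Jz, A] - A == -L exactly for spin j = two_j / 2."""
    n = two_j + 1
    j = Fraction(two_j, 2)
    # Jz = j*I - N with N = diag(0, 1, ..., n-1), i.e. eigenvalues j, j-1, ..., -j.
    jz = [j - r for r in range(n)]
    # Cyclic permutation matrix: row r has its 1 in column (r+1) mod n.
    A = [[Fraction(1) if c == (r + 1) % n else Fraction(0) for c in range(n)]
         for r in range(n)]
    # L: single non-zero entry n in the bottom-left corner (assumed form).
    L = [[Fraction(n) if (r == n - 1 and c == 0) else Fraction(0) for c in range(n)]
         for r in range(n)]
    # Jz is diagonal, so [Jz, A]_{rc} = (jz_r - jz_c) * A_{rc}.
    lhs = [[(jz[r] - jz[c]) * A[r][c] - A[r][c] for c in range(n)] for r in range(n)]
    rhs = [[-L[r][c] for c in range(n)] for r in range(n)]
    return lhs == rhs
```

Because the arithmetic is done with `Fraction`, half-integer spins (odd `two_j`) are handled exactly and the comparison involves no rounding tolerance.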
From equations (92) and (94) it follows that

$$[N, \phi]_{rs} = \begin{cases} -i\,(\log \varepsilon)\, \dfrac{r - s}{\varepsilon^{r-s} - 1} & \text{for } \varepsilon^{r-s} \ne 1 \\[2mm] 0 & \text{for } \varepsilon^{r-s} = 1 \end{cases} \tag{98}$$

Thus the number operator $N$ (with spectrum bounded below) and the phase operator are conjugate in the sense of equation (98). It is perhaps a great luxury to demand that they must be canonically conjugate.

8. CONCLUSIONS

We have discussed the quantum mechanics of bounded operators with a discrete spectrum acting on a finite dimensional Hilbert space. We have shown that in the limit of continuous spectrum, with the dimension going to infinity, one gets the usual theory. As an illustration we have studied the algebra of angular momentum operators reformulated by quantizing the azimuthal angle. We have briefly remarked on the phase operator conjugate to the number operator. We believe that QMDS can be applied to a system with periodicity, like the cyclic lattice. Also, we may avoid many difficulties (divergences etc.) if we work with a finite number of states and eventually take suitable limits. It remains to be seen how QMDS works with realities.

ACKNOWLEDGEMENTS

I thank Professors B. Gruber and W. Hink for their gracious hospitality at the University of Würzburg. The article was written during my stay at the International Centre for Theoretical Physics, Trieste. I am grateful to Professor Abdus Salam and the I.A.E.A. for their hospitality. A brief discussion with Professor C. N. Yang is gratefully acknowledged.

REFERENCES

Boerner, H. (1963) Representations of Groups, North-Holland Publishing Co., Amsterdam, Chap. 8.
Brauer, R. and Weyl, H. (1935) 'Spinors in n dimensions', Am. J. Math., 57, 425-449.
Cartier, P. (1966) 'Quantum mechanical commutation relations and theta functions', Proc. Symp. Pure Math., 9, 361-383.
Dirac, P. A. M. (1926) Proc. Roy. Soc., 109 A, 642.
Dirac, P. A. M. (1930) The Principles of Quantum Mechanics, Oxford University Press, London.
Dixmier, J.
(1958) 'Sur la relation i(PQ - QP) = 1', Compositio Math., 13, 263-269.
Gantmacher, F. R. (1959) Matrix Theory, Vol. I, Chelsea, New York, p. 239.
Heisenberg, W. (1925) Zeit. Phys., 33, 879.
Heisenberg, W. (1931) The Physical Principles of the Quantum Theory, Dover, New York.
Levy-Leblond, J. M. (1973) 'Azimuthal quantization of angular momentum', Rev. Mex. Fis., 22, 15-23.
Mackey, G. W. (1949) 'On a theorem of Stone and von Neumann', Duke Math. J., 16, 313-326.
Morinaga, K. and Nono, T. (1952) J. Sci. Hiroshima Univ., A6, 13.
Morris, A. O. (1967) 'On a generalized Clifford algebra', Quart. J. Math. Oxford (2), 18, 7-12.
Nelson, E. (1959) 'Analytic vectors', Ann. Math., 70, 572-615.
von Neumann, J. (1931) 'Die Eindeutigkeit der Schrödingerschen Operatoren', Math. Ann., 104, 570-578.
Ramakrishnan, A. (1967) 'Dirac Hamiltonian as a member of a hierarchy', J. Math. Anal. Appl., 20, 9.
Ramakrishnan, A., Chandrasekaran, P. S., Ranganathan, N. R., Santhanam, T. S. and Vasudevan, R. (1969a) 'Generalized Clifford algebra and the unitary group', J. Math. Anal. Appl., 27, 164.
Ramakrishnan, A., Santhanam, T. S. and Chandrasekaran, P. S. (1969b) 'Representation theory of generalized Clifford algebras', J. Math. Phys. Sci. (Madras), 3, 307.
Rellich, F. (1946) 'Der Eindeutigkeitssatz für die Lösungen der quantenmechanischen Vertauschungsrelationen', Nachr. Akad. Wiss. Göttingen, Math.-Phys. Kl., 107-115.
Santhanam, T. S. and Tekumalla, A. R. (1976) 'Quantum mechanics in finite dimensions', Foundations of Physics, 6, 5, 583-587.
Schwinger, J. (1960) 'Unitary operator bases', Proc. Nat. Acad. Sci. (USA), 46, 570-579.
Stone, M. (1930) 'Linear transformations in Hilbert space III, operational methods in group theory', Proc. Nat. Acad. Sci. (USA), 16, 172-175.
Weyl, H. (1931) Theory of Groups and Quantum Mechanics, Dover, New York, pp. 272-280.
Yamazaki, K. (1964) 'On projective representations and ring extensions of finite groups', J. Fac. Sci. Univ. Tokyo, Sect. I, 10, 147-195.
PART Formal Quantum Theory

Four Approaches to Axiomatic Quantum Mechanics

STANLEY P. GUDDER
University of Denver, Denver, U.S.A.

1. INTRODUCTION

This is a survey article on contemporary approaches to axiomatic quantum mechanics. There are, at present, four main frameworks within which axiomatic quantum mechanics is being studied. These are the classical approach, the algebraic approach, the quantum logic approach and the convexity approach. Each of these approaches has its advocates and critics, its strengths and weaknesses, its history and literature. To do justice to these approaches, an entire volume could easily be dedicated to each. Therefore, by necessity, this survey must be fairly superficial. I shall include an introduction to the framework of each approach, some of the interrelations between them and a few of the important results they encompass. I hope to give the reader a unifying viewpoint, expose him to results that are scattered throughout the literature and not previously compiled in one place, and finally to announce some little known and new results.

The importance of axiomatic quantum theories to mathematics and physics has perhaps not been sufficiently recognized. This field is not only important in its own right, but has had tremendous influence on, and spin-off to, other areas. For example, in the physical sciences there are important applications of results, methods and concepts of these theories to statistical mechanics, thermodynamics, turbulence, solid-state physics and laser physics, among others. In mathematics, many of the most active areas of research owe their original conception and/or later development to axiomatic quantum mechanics.
These include: Hilbert spaces, self-adjoint and symmetric operators, spectral theory, general operator theory, von Neumann algebras, Lie groups and algebras, group representations, Schwartz distributions, C*-algebras, Jordan algebras, modular lattices, orthomodular lattices, continuous geometries and functions of several complex variables. Axiomatic quantum mechanics is a prime example of the fertile interplay between mathematics and physics. Even if the present methods, concepts and results prove to be obsolete and are eventually superseded, the applications and mathematics that they have inspired will justify their existence.

It is generally recognized that the two most basic concepts in quantum mechanics are those of a state and an observable. These two concepts serve as the basic building blocks of most axiomatic theories and, in particular, the approaches considered here. Each of the four approaches of this article will have as its primitive axiomatic elements one of these entities, and this is one of our unifying themes. More specifically, in the classical approach, the observables are assumed to be self-adjoint operators in a Hilbert space; in the algebraic approach, the observables are taken to be elements of a C*-algebra; in the quantum logic approach, the axiomatic elements are certain types of observables called propositions; and in the convexity approach, the states are taken to be elements of a convex structure.

2. THE CLASSICAL APPROACH

The classical approach to axiomatic quantum mechanics is not only the prototype for other approaches, it is the most widely used and probably the most popular among physicists. It was originated by Dirac (1930) and von Neumann (1932). There are three equivalent formulations for this approach.

(A) Formulation 1

The observables $\mathcal{O}$ of a physical system are described by self-adjoint linear operators acting on a complex Hilbert space $H$.
Thus in this formulation, the observables are the basic axiomatic elements. We shall usually identify an observable with its corresponding operator. A state of a physical system is a complete description of the preparation or condition of the system. In quantum mechanics the state is determined by the expectations or average values of the observables when the system is prepared according to that state. Thus, the states can be described by the set of expectation functionals $\mathcal{S}$ on $\mathcal{O}$. For $s \in \mathcal{S}$ and $A \in \mathcal{O}$, we define $s(A)$ to be the expectation of $A$ in the state corresponding to $s$. Of course, for an unbounded observable $A$, the expectation $s(A)$ may not exist. It is usually convenient, therefore, to consider the set of bounded observables $\mathcal{O}_b$. These are described by bounded self-adjoint operators. The set $\mathcal{O}_b$ is still large enough to determine a state.

What are the properties of the functional $s(A)$ for $s \in \mathcal{S}$, $A \in \mathcal{O}_b$? First, the identity operator $I$ corresponds to the observable that always has the value one, so $s(I) = 1$. Also, if $A$ is a self-adjoint operator with a non-negative spectrum (this corresponds to an observable with non-negative values) then we should have $s(A) \ge 0$. Furthermore, one can argue that two observables whose self-adjoint operators commute are simultaneously measurable; that is, a measurement of one does not interfere with a measurement of the other. In this case, the expectation of their sum should be the sum of their expectations. In slightly more general form, $s(\alpha A + \beta B) = \alpha s(A) + \beta s(B)$ for all $\alpha, \beta \in \mathbb{R}$ ($\mathbb{R}$ is the set of real numbers) whenever $A$ and $B$ commute. Finally, a continuity condition is imposed. This is justified by the fact that if two observables are 'close', then their expectations should also be 'close'. But how do we define 'close' mathematically? Here it is defined in terms of strong convergence. That is, a sequence of bounded operators $A_i$ converges strongly to a bounded operator $A$ if $A_i \phi \to A\phi$ for every $\phi \in H$.
The continuity condition is given as follows: if a sequence of bounded observables $A_i$ converges strongly to $A$, then $s(A_i) \to s(A)$. Using a truly amazing theorem due to Gleason (1957), the above four conditions give the following characterization of states. For every $s \in \mathcal{S}$ there is a positive trace class operator $T_s$ (the density operator) of trace 1 such that $s(A) = \mathrm{Tr}(T_s A)$ for every $A \in \mathcal{O}_b$. Thus it follows that a state is not only a linear functional on $\mathcal{O}_b$ but it is given by a density operator.

There are many interesting consequences of these simple, far-reaching axioms. We now mention two of them and others will be seen later. Let $\phi$ be a unit vector and let $P_\phi$ be the one-dimensional projection onto the subspace determined by $\phi$. Then $P_\phi$ is a self-adjoint operator and also a density operator. Thus $P_\phi$ can be interpreted as both an observable and a state. This dual nature of $P_\phi$ makes it possible to define transition probabilities in a succinct manner. The significance of $P_\phi$ as a state is that $P_\phi$ is a pure state since it cannot be written as a convex combination of other states. As an observable, $P_\phi$ can be interpreted as corresponding to the statement 'the system is in the pure state $P_\phi$'. If $P_\phi$ and $P_\psi$ are two pure states, then $\mathrm{Tr}(P_\phi P_\psi)$ is the expectation of the observable $P_\psi$ in the state $P_\phi$ and is interpreted as the probability that the system will be found in the state $P_\psi$ when we know it is in the state $P_\phi$. This is the transition probability between the two states and is given by

$$\mathrm{Tr}(P_\phi P_\psi) = \langle \phi, P_\psi \phi \rangle = |\langle \phi, \psi \rangle|^2 .$$

Notice that for a pure state $P_\phi$, the expectation of an observable $A$ is given by $\mathrm{Tr}(P_\phi A) = \langle \phi, A\phi \rangle$.

Another interesting consequence of the axioms is the fact that the product of the variances of two observables satisfies

$$s\big([A - s(A)]^2\big)\, s\big([B - s(B)]^2\big) \ge \tfrac{1}{4}\, \big| s([A, B]) \big|^2 ,$$

where $s([A, B])$ is understood as $\mathrm{Tr}(T_s [A, B])$. This inequality provides a lower bound for the simultaneous measurability of $A$ and $B$ and is a mathematical formulation of the Heisenberg uncertainty principle.
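The two consequences just stated can be checked numerically in the smallest non-trivial case. The following is a minimal sketch, assuming a two-dimensional Hilbert space with the Pauli matrices as observables; the helper names (`matmul`, `projector`, `expectation`, `variance`) and the choice of vectors are illustrative assumptions and do not come from the text.

```python
import math

# 2x2 complex matrices as nested lists; all helper names are illustrative.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def projector(phi):
    # One-dimensional projection P_phi onto the span of the unit vector phi.
    return [[phi[i] * phi[j].conjugate() for j in range(2)] for i in range(2)]

IDENTITY = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

def expectation(T, A):
    # s(A) = Tr(T_s A) for the density operator T.
    return trace(matmul(T, A))

def variance(T, A):
    # s([A - s(A)]^2)
    centred = sub(A, scale(expectation(T, A), IDENTITY))
    return expectation(T, matmul(centred, centred)).real

phi = [1 + 0j, 0j]                                   # assumed unit vector
psi = [1 / math.sqrt(2) + 0j, 1 / math.sqrt(2) + 0j]  # assumed unit vector
P_phi, P_psi = projector(phi), projector(psi)

# Transition probability: Tr(P_phi P_psi) versus |<phi, psi>|^2.
transition = trace(matmul(P_phi, P_psi)).real
overlap = sum(a.conjugate() * b for a, b in zip(phi, psi))

# Uncertainty inequality in the pure state P_phi.
commutator = sub(matmul(SIGMA_X, SIGMA_Y), matmul(SIGMA_Y, SIGMA_X))
lhs = variance(P_phi, SIGMA_X) * variance(P_phi, SIGMA_Y)
rhs = 0.25 * abs(expectation(P_phi, commutator)) ** 2

print(transition, abs(overlap) ** 2)
print(lhs, rhs)
```

For this particular choice the product of variances and the commutator bound coincide, so the inequality holds with equality.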
The inequality also shows that two commuting observables are simultaneously measurable, which substantiates our earlier statement to that effect.

The dynamics of the system is also easily formulated within this framework. If an observable is given by the operator $A$ at time $t = 0$, then this observable is given by an operator $W_t A$ at time $t$. This is the Heisenberg picture of the dynamics. There is an equivalent formulation called the Schrödinger picture in which the observables are kept constant and the states are assumed to evolve. If the system is in a pure state given by a unit vector $\phi$ at time $t = 0$, then at time $t$ the system is in a pure state given by a unit vector $W_t\phi$. It follows by a theorem of Mazur and Ulam (1932) (there is also a related theorem due to Wigner (1931)) that $W_t$ is given by a unitary operator $U_t$. Furthermore, since

$$\langle \phi, W_t A\, \phi \rangle = \langle U_t\phi, A U_t\phi \rangle = \langle \phi, U_t^{-1} A U_t \phi \rangle$$

we see that $W_t A = U_t^{-1} A U_t$. If the state at time $t_1$ is given by $U_{t_1}\phi$, then the state at time $t_2 + t_1$ is given by $U_{t_2+t_1}\phi = U_{t_2} U_{t_1}\phi$ and so we must have $U_{t_2+t_1} = U_{t_2} U_{t_1}$ for every $t_1, t_2 \in (-\infty, \infty)$. Furthermore, letting $t_2 = 0$ and $t_2 = -t_1$ in the above identity gives $U_0 = I$ and $U_{-t} = U_t^{-1}$, so $U_t$ forms a one-parameter group of unitary transformations. It is also usually assumed that $t \mapsto U_t$ is strongly continuous. It follows from Stone's theorem (Stone, 1930) that there exists a unique self-adjoint operator $H_0$ such that $U_t\phi = e^{-iH_0 t}\phi$ for all $t \in (-\infty, \infty)$. The operator $H_0$ is identified with the Hamiltonian of the system. With $\phi_t = U_t\phi$ and $A_t = U_t^{-1} A U_t$, the differential forms of the evolution laws become:

$$\frac{d}{dt}\phi_t = \frac{d}{dt} U_t\phi = -iH_0 U_t\phi = -iH_0\phi_t$$

$$\frac{d}{dt}A_t = \frac{d}{dt} U_t^{-1} A U_t = i[H_0, U_t^{-1} A U_t] = i[H_0, A_t]$$

The first of these equations is called Schrödinger's equation.

(B) Formulation 2

Formulation 1 is the prototype of the algebraic approach to axiomatic quantum mechanics.
We next briefly consider an equivalent formulation which is the prototype of the quantum logic approach. In this formulation the primitive axiomatic elements are the 'propositions' of a physical system. A proposition of a system represents a special type of observable that has at most two values, 0 and 1 or true and false. For example, a counter which is either activated or unactivated is a proposition. A filter that passes only certain types of particles is a proposition since a particle either passes or does not pass. The basic postulate is that the propositions of a physical system are described by the set of closed subspaces $\mathcal{P}$ of a complex Hilbert space $H$. This is equivalent to describing the propositions by orthogonal projections on $H$. The set of states $\mathcal{S}'$ of the system now determines the probabilities that propositions are true. Thus, if $s \in \mathcal{S}'$, $P \in \mathcal{P}$ then $s(P) \in [0, 1]$. Since the identity projection $I$ corresponds to the proposition that is always true, we must have $s(I) = 1$ for every $s \in \mathcal{S}'$. If $P_i$ is a sequence of mutually orthogonal projections (i.e. $P_i P_j = 0$, $i \ne j$) then the probability that $\sum P_i$ is true should be the sum of the probabilities that each $P_i$ is true. We therefore postulate that $s(\sum P_i) = \sum s(P_i)$ for every $s \in \mathcal{S}'$. In other words a state is described by a 'probability measure' on $\mathcal{P}$. Again by Gleason's theorem, if $s \in \mathcal{S}'$ there exists a unique density operator $T_s$ such that $s(P) = \mathrm{Tr}(T_s P)$ for every $P \in \mathcal{P}$.

The general observables come into the theory in the following way. If $X$ is an observable and $E$ is a Borel subset of the real line then the pair $(X, E)$ corresponds to the proposition: '$X$ has a value in the set $E$'. Thus $X$ can be thought of as a map from the Borel sets $B(\mathbb{R})$ into $\mathcal{P}$. It is easy to justify that $X : B(\mathbb{R}) \to \mathcal{P}$ should satisfy
(1). $X(\mathbb{R}) = I$;
(2). If $E \cap F = \emptyset$, then $X(E)X(F) = 0$;
(3). If $E_i$ are mutually disjoint, then $X(\bigcup E_i) = \sum X(E_i)$.
Thus $X$ can be thought of as a projection-valued measure.
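Conditions (1) to (3) can be tried out on a finite-dimensional toy model. The sketch below assumes a two-dimensional Hilbert space and the observable whose matrix is [[0, 1], [1, 0]], with eigenvalues -1 and +1; Borel sets are represented by membership predicates. The names `SPECTRAL` and `X_of` and this representation of sets are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Spectral projections of the matrix [[0, 1], [1, 0]] (eigenvalues -1 and +1).
SPECTRAL = {
    -1: [[half, -half], [-half, half]],
    +1: [[half, half], [half, half]],
}
ZERO = [[0, 0], [0, 0]]
IDENTITY = [[1, 0], [0, 1]]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def X_of(E):
    """X(E): sum of the spectral projections whose eigenvalue lies in E.

    E is a predicate on real numbers standing in for a Borel set.
    """
    total = ZERO
    for eigenvalue, P in SPECTRAL.items():
        if E(eigenvalue):
            total = add(total, P)
    return total

whole_line = lambda x: True
negative = lambda x: x < 0
positive = lambda x: x > 0
nonzero = lambda x: x != 0   # union of the two disjoint sets above

# (1) X(R) = I
print(X_of(whole_line) == IDENTITY)
# (2) disjoint sets give orthogonal projections
print(matmul(X_of(negative), X_of(positive)) == ZERO)
# (3) additivity over disjoint sets
print(X_of(nonzero) == add(X_of(negative), X_of(positive)))

# Finite-dimensional form of A = integral of lambda X(d lambda): a weighted sum.
A = ZERO
for eigenvalue, P in SPECTRAL.items():
    A = add(A, scale(eigenvalue, P))
print(A == [[0, 1], [1, 0]])
```

In finite dimensions the integral over the projection-valued measure reduces to the finite sum in the last step, which recovers the original matrix.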
By the spectral theorem there exists a unique self-adjoint operator $A$ such that $A = \int \lambda\, X(d\lambda)$. Conversely, if $A$ is a self-adjoint operator, then there is a projection-valued measure $E \mapsto P^A(E)$ such that $A = \int \lambda\, P^A(d\lambda)$. Thus there exists a one-to-one correspondence between observables and self-adjoint operators. If $s \in \mathcal{S}'$ and $A$ is an observable, then $s[P^A(E)]$ is the probability that $A$ has a value in the set $E$ when the system is in the state $s$. It is then clear that the expectation of $A$ in the state $s$ is

$$s(A) = \int \lambda\, s[P^A(d\lambda)] = \int \lambda\, \mathrm{Tr}[T_s P^A(d\lambda)] = \mathrm{Tr}\Big[T_s \int \lambda\, P^A(d\lambda)\Big] = \mathrm{Tr}(T_s A)$$

We thus see that corresponding to a state $s \in \mathcal{S}'$ as defined in this formulation there is a state $s \in \mathcal{S}$ as defined in Formulation 1. Conversely, if $s \in \mathcal{S}$ as defined in Formulation 1 then in particular $s$ can act on projections. If we restrict $s$ to $\mathcal{P}$ then $s$ is a state in $\mathcal{S}'$ as defined in Formulation 2. For this reason, Formulations 1 and 2 are equivalent.

(C) Formulation 3

This formulation of the classical approach to axiomatic quantum mechanics is the prototype of the convexity approach. In this formulation the states form the basic axiomatic elements. The basic postulate is that the states of a physical system can be described by density operators $\mathcal{S}$ on a complex Hilbert space $H$. Convexity comes into play since $\mathcal{S}$ is a (strong) convex set. That is, if $T_i \in \mathcal{S}$ and $\sum \lambda_i = 1$, $\lambda_i \ge 0$, then $\sum \lambda_i T_i \in \mathcal{S}$. We call $\sum \lambda_i T_i$ a mixture of the states $T_i$. The extreme points of $\mathcal{S}$ are the states that cannot be written as mixtures of other states. These states are also called pure states and are given by one-dimensional projections. In Formulation 1, we mentioned that a state is determined by the expectation values it gives to observables. In that formulation the observables were the basic axiomatic elements and the states were derived from the observables. We now turn the situation around. Now the states are the axiomatic elements and we shall derive the observables from the states.
Hence if $A$ is a bounded observable (we are not now assuming that $A$ corresponds to a self-adjoint operator; this will follow automatically) and $T \in \mathcal{S}$ a state, we define $A(T)$ to be the expectation of $A$ in the state $T$. We now seek the properties of the function $T \mapsto A(T)$. First, it is natural to assume that this function preserves convex combinations. That is, $A(\sum \lambda_i T_i) = \sum \lambda_i A(T_i)$ whenever $\lambda_i \ge 0$, $\sum \lambda_i = 1$. Second, since observables must have real expectations, we assume $A(T) \in \mathbb{R}$ for every $T \in \mathcal{S}$. Finally, since states that are 'close' should give expectations that are 'close', a continuity condition is imposed. This continuity condition is usually given in terms of the trace norm. This norm is defined as follows. If $T_1, T_2 \in \mathcal{S}$ then there is a unique positive trace class operator $T_3$ such that $T_3^2 = (T_1 - T_2)^2$. This operator is denoted $|T_1 - T_2|$. The trace norm of $T_1 - T_2$ is $\|T_1 - T_2\|_1 = \mathrm{Tr}\,|T_1 - T_2|$. The continuity condition becomes: if $T_i, T \in \mathcal{S}$ and $\|T_i - T\|_1 \to 0$ as $i \to \infty$, then $A(T_i) \to A(T)$ for every bounded observable $A$. If the above three conditions hold, it can be shown (Schatten, 1950) that for any bounded observable $A$ there exists a unique bounded self-adjoint operator $\hat{A}$ on $H$ such that $A(T) = \mathrm{Tr}(T\hat{A})$. We thus see that this formulation is equivalent to Formulations 1 and 2.

(D) Strengths and Weaknesses

Now that we have seen a brief formulation of the classical approach, a natural question is, why look at other approaches? Is there something wrong with this approach that makes it necessary to abandon or modify it? A lucid discussion of this question can be found in (Emch, 1972). Here we shall limit ourselves to a few comments.

Let us begin with the strengths of the classical approach. First, and most important, this approach has been highly successful, especially for systems with a finite number of degrees of freedom. Second, it has the advantage of concreteness.
The observables can be identified with self-adjoint operators, the states with density operators, the propositions with projection operators and the dynamics with unitary operators on a Hilbert space. Now all that is needed is to specify what Hilbert space is to be used and to give a prescription for the self-adjoint operator that corresponds to each observable. There is a very satisfactory way of doing this in the case of a finite number of degrees of freedom. Suppose the system has $3N$ degrees of freedom in a cartesian coordinate system $x_1, x_2, x_3, \ldots, x_{3N}$. Then the Hilbert space is taken to be the space $L^2(\mathbb{R}^{3N}, d\lambda)$ of square integrable complex-valued functions on $\mathbb{R}^{3N}$. Using the correspondence principle or other heuristic arguments, the position observables are prescribed as $Q_i f(x) = x_i f(x)$, $i = 1, 2, \ldots, 3N$, and the momentum observables as $P_i f(x) = -i\hbar(\partial/\partial x_i) f(x)$, $i = 1, 2, \ldots, 3N$. The other observables are now given in terms of these basic observables. The quantum mechanics is now completely described.

The above procedure is satisfactory for the following reason. The position and momentum operators satisfy the canonical commutation relations $[Q_j, P_k] = i\hbar\,\delta_{jk}$. (To be perfectly rigorous one should work with the Weyl form of the commutation relations but for simplicity we shall be a little imprecise here.) Now by a theorem of von Neumann (1931), if $Q_i^0, P_i^0$, $i = 1, 2, \ldots, 3N$, are an (irreducible) set of self-adjoint operators on a Hilbert space $H$ which satisfy the above commutation relations, then $H$ is unitarily equivalent to $L^2(\mathbb{R}^{3N}, d\lambda)$ and $Q_i^0, P_i^0$ are equivalent to $Q_i, P_i$ defined above, respectively. Thus, if the framework is to satisfy these basic commutation relations the Hilbert space and the observables (and hence the states, dynamics, etc.) are uniquely determined within a unitary equivalence.

Now for some of the weaknesses of the classical approach.
Von Neumann's theorem does not extend to systems with an infinite number of degrees of freedom. In fact, one can show that in this case there are infinitely many (in fact, uncountably many) inequivalent representations of the canonical commutation relations. Each of these representations gives different results. How is one to choose the 'right' representation? In fact, if one chooses the most 'natural' representation, namely Fock space, the results are unsatisfactory when interactions are involved. This problem would be merely a mathematical curiosity if no important physical system had an infinite number of degrees of freedom, but unfortunately all of quantum field theory lies within this range. Furthermore, this discussion would be unnecessary if the present quantum field theory were successful, but this is far from the case.

A second weakness of the classical approach is the basic axioms themselves. Where does the Hilbert space come from? Why describe observables by self-adjoint operators? There seems to be no really convincing reason. In other words, the axioms are ad hoc, devoid of empirical evidence. There are some who believe that the troubles encountered in quantum field theory may be due to the basic axioms. They feel that if these axioms were established on a firmer empirical foundation then many of the difficulties would dissolve.

Besides the basic axioms, there is another assumption, of a less important character, which is of a questionable nature. This is the continuity condition placed on the states. In Formulation 1, this was defined in terms of the strong convergence of operators. But there seems to be no physical significance for this type of convergence. In Formulation 2 this convergence is contained implicitly in the countable additivity of states. It is clear that states should be finitely additive but there is no physical reason for them to be countably additive.

3. THE ALGEBRAIC APPROACH

The algebraic approach was initiated by Jordan, von Neumann and Wigner (1934). It was later developed by Segal (1947), Haag and Kastler (1964) and many others. In this approach, the bounded observables are taken as the primitive axiomatic elements. We begin with a slight modification of Segal's formulation.

(A) Segal Algebras

A collection of objects $\mathcal{A}$ is called a Segal algebra if $\mathcal{A}$ satisfies the following postulates.

Axiom A. $\mathcal{A}$ is a linear space over the real numbers $\mathbb{R}$.

Axiom B. There exists in $\mathcal{A}$ an identity $I$ and for every $A \in \mathcal{A}$ and integer $n \ge 0$ an element $A^n \in \mathcal{A}$ which satisfies the following: If $f$, $g$ and $h$ are real polynomials, and $f(g(\lambda)) = h(\lambda)$ for every $\lambda \in \mathbb{R}$, then $f(g(A)) = h(A)$; where $f(A) = \beta_0 I + \sum_{k=1}^{n} \beta_k A^k$ if $f(\lambda) = \sum_{k=0}^{n} \beta_k \lambda^k$.

Axiom C. There is defined for each $A \in \mathcal{A}$ a real number $\|A\| \ge 0$ such that the pair $(\mathcal{A}, \|\cdot\|)$ is a real Banach space.

Axiom D. $\|A^2 - B^2\| \le \max(\|A^2\|, \|B^2\|)$ and $\|A^2\| = \|A\|^2$.

Axiom E. $A^2$ is a continuous function of $A$.

Of course, a Segal algebra is supposed to describe axiomatically the set of bounded observables for a physical system. The underlying idea is that an observable is determined by its average values as given by laboratory experiments. Let $A$ be an observable and $\lambda \in \mathbb{R}$. If the average values of $A$ as determined by a laboratory experiment are multiplied by $\lambda$, then this determines a new observable $\lambda A$. If the average values of two observables $A$ and $B$ are added, then this determines a new observable $A + B$ whose average values are these sums. This argument justifies Axiom A. Unfortunately, this procedure cannot be used to define products since in general the expectation of a product is not the product of the expectations. For this reason, products of observables are not defined. Axiom B states that polynomials in a single observable exist and enjoy the usual properties.
The norm in Axiom C can be thought of as the maximum absolute expectation value of an observable. The properties of a norm then easily follow. The completeness is included for mathematical convenience since if the system were not complete it could be completed in the usual way still preserving all the axioms. Axiom D can be justified in terms of the interpretation of the norm given above. Axiom E is a natural continuity condition. We henceforth call the elements of a Segal algebra observables.

An example of a Segal algebra is the set of bounded self-adjoint operators on a Hilbert space as given in Formulation 1 of the classical approach. Another example is the Banach space $\mathcal{C}(T)$ of all continuous real-valued functions on a compact Hausdorff space $T$ under the supremum norm. This example corresponds to a classical mechanical system. In this case, $T$ corresponds to a phase space and $\mathcal{C}(T)$ is the set of observables (dynamical variables) which are necessarily compatible (or commuting or simultaneously measurable).

Let $\mathcal{A}$ be a Segal algebra. A state of $\mathcal{A}$ is a real-valued linear functional $\omega$ on $\mathcal{A}$ such that $\omega(A^2) \ge 0$ for all $A \in \mathcal{A}$ and $\omega(I) = 1$. The states are supposed to describe the expectation values of the observables for a particular preparation or condition of the physical system. With this interpretation, the above properties of a state are clear. A collection of states $\mathcal{S}$ on $\mathcal{A}$ is full if for any two distinct observables $A$, $B$ there exists a state $\omega \in \mathcal{S}$ such that $\omega(A) \ne \omega(B)$. Segal (1947) has shown that any Segal algebra has a full set of states and that $\|A\| = \sup\{|\omega(A)| : \omega \in \mathcal{S}\}$ for all $A \in \mathcal{A}$. This latter fact justifies our interpretation of the norm as the maximum absolute expectation value of an observable.

Although products of arbitrary observables are not defined, we can define a 'symmetrized product'. For $A, B \in \mathcal{A}$ the symmetrized product $A \circ B$ is defined by $A \circ B = \tfrac{1}{2}[(A + B)^2 - A^2 - B^2]$.
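The definition uses only sums and squares, so it can be evaluated directly in the first example above. The following is a small sketch assuming 2x2 real symmetric matrices as stand-ins for bounded self-adjoint operators; the matrices `A`, `B` and the helper names are illustrative choices, not taken from the text. For matrices the definition expands to $\tfrac{1}{2}(AB + BA)$, and the sketch also exhibits a triple for which the product is not associative.

```python
from fractions import Fraction

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def square(X):
    return matmul(X, X)

def sym(X, Y):
    # Symmetrized product, built from squares only: ((X+Y)^2 - X^2 - Y^2) / 2.
    diff = sub(sub(square(add(X, Y)), square(X)), square(Y))
    return scale(Fraction(1, 2), diff)

IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
A = [[1, 0], [0, -1]]   # illustrative self-adjoint matrix
B = [[0, 1], [1, 0]]    # illustrative self-adjoint matrix

# Commutative, and the identity acts as a unit.
print(sym(A, B) == sym(B, A))
print(sym(A, IDENTITY) == A)

# For matrices the definition agrees with (AB + BA) / 2.
jordan = scale(Fraction(1, 2), add(matmul(A, B), matmul(B, A)))
print(sym(A, B) == jordan)

# Not associative: A o (B o B) = A, while (A o B) o B = 0.
left = sym(A, sym(B, B))
right = sym(sym(A, B), B)
print(left, right)
```

Here $A$ and $B$ anticommute, so $A \circ B = 0$ while $B \circ B = I$, which is what makes the two bracketings differ.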
The physical significance of $A \circ B$ is not clear, although it is a convenient mathematical construct. This product does not enjoy very many algebraic properties. It is clearly commutative and from Axiom B, $A \circ I = A$ for every $A \in \mathcal{A}$. However, in general it need not be homogeneous ($(\lambda A) \circ B = \lambda(A \circ B)$), distributive ($A \circ (B + C) = A \circ B + A \circ C$), or associative ($A \circ (B \circ C) = (A \circ B) \circ C$). In fact, the Segal algebra of all bounded self-adjoint operators on a Hilbert space is an example in which the associative law does not hold for the symmetrized product. We shall later give an example in which distributivity and homogeneity do not hold.

Lemma 1. The symmetrized product is homogeneous if and only if it is distributive.

Proof. Suppose the symmetrized product is homogeneous. It follows that $-(A \circ B) = A \circ (-B)$. Writing this out in terms of the definition gives

$$(A + B)^2 + (A - B)^2 = 2A^2 + 2B^2 \qquad (1)$$

It follows that

$$A \circ B = \tfrac{1}{4}[(A + B)^2 - (A - B)^2] \qquad (2)$$

Now substitute $A + C$ and $A - C$ for $A$ in (1) to get the following two equations.

$$[(A + B) + C]^2 + [(A - B) + C]^2 = 2(A + C)^2 + 2B^2$$

$$[(A + B) - C]^2 + [(A - B) - C]^2 = 2(A - C)^2 + 2B^2$$

Subtracting these last two equations and using (2) gives

$$(A + B) \circ C + (A - B) \circ C = 2(A \circ C) = (2A) \circ C$$

Replace $A + B$ by $A$ and $A - B$ by $B$ to get

$$A \circ C + B \circ C = (A + B) \circ C \qquad (3)$$

Conversely, suppose the distributive law (3) holds. Then replacing $A$ by $B$ gives $2(B \circ C) = (2B) \circ C$. Now replacing $B$ by $B/2$ gives $(\tfrac{1}{2}B) \circ C = \tfrac{1}{2}(B \circ C)$. Replacing $A$ by $2B$ gives $3(B \circ C) = (3B) \circ C$, etc. Also, replacing $A$ by $-B$ gives $(-B) \circ C = -(B \circ C)$. In this way $\lambda(B \circ C) = (\lambda B) \circ C$ for every rational $\lambda$. But since addition and squaring are continuous, so is the symmetrized product. It follows by continuity that the symmetrized product is homogeneous.

Corollary. If the symmetrized product is associative, then it is homogeneous and distributive.

Proof.
If the symmetrized product is associative, then

$$(\lambda A) \circ B = (\lambda I \circ A) \circ B = \lambda I \circ (A \circ B) = \lambda(A \circ B)$$

The converse of this corollary does not hold as the Hilbert space example shows.

We now give an example which shows that the symmetrized product need not be homogeneous (and hence not distributive or associative). This example is a simplified version of one due to Sherman (1956). Let $X = \mathbb{R}^3$ and define addition and multiplication by scalars in the usual ways. Let $I = (1, 1, 1)$ and $(\alpha I)^n = \alpha^n I$ for $n \ge 0$ an integer. If $x = (x_1, x_2, x_3) \in X$, let $\bar{x} = \max x_i$, $\underline{x} = \min x_i$ and let $X_0 = \{x \in X : \bar{x} = 1, \underline{x} = -1\}$. If $x \in X_0$, define $x^n = x$ if $n$ is an odd integer, and $x^n = I$ if $n$ is an even integer. If $x \in X$, then it is easy to see that there exists an $x_0 \in X_0$ such that $x = \alpha x_0 + \beta I$, $\alpha, \beta \in \mathbb{R}$. Define

$$x^n = (\alpha x_0 + \beta I)^n = \sum_{j=0}^{n} \binom{n}{j} \alpha^j \beta^{n-j} x_0^j$$

It is easy to see that $x^n$ is well-defined. For $x \in X$ we define

$$\|x\| = \|\alpha x_0 + \beta I\| = \max\{|\beta - \alpha|, |\alpha + \beta|\}$$

Sherman (1956) has shown that with these operations $X$ is a Segal algebra. Now let $a = (1, 1, 0)$ and $b = (1, 0, 1)$. We shall show that $2(a \circ b) \ne (2a) \circ b$. Indeed, $a = \tfrac{1}{2}(1, 1, -1) + \tfrac{1}{2}I$ and $b = \tfrac{1}{2}(1, -1, 1) + \tfrac{1}{2}I$ and so $a + b = \tfrac{1}{2}(1, -1, -1) + \tfrac{3}{2}I$. Hence

$$2(a \circ b) = [(a + b)^2 - a^2 - b^2] = [(4, 1, 1) - (1, 1, 0) - (1, 0, 1)] = (2, 0, 0)$$

Furthermore,

$$(2a) \circ b = \tfrac{1}{2}[(2a + b)^2 - (2a)^2 - b^2] = \tfrac{1}{2}[(9, 5, 1) - (4, 4, 0) - (1, 0, 1)] = (2, \tfrac{1}{2}, 0)$$

A Segal algebra is compatible if the symmetrized product is associative. A collection of observables is compatible if the subalgebra generated by the collection is compatible. Segal (1947) has proved that a compatible Segal algebra is isomorphic (algebraically and metrically) with the algebra $\mathcal{C}(T)$ of all real-valued continuous functions on a compact Hausdorff space $T$ considered earlier. It is well known that the states on $\mathcal{C}(T)$ consist of the regular Borel probability measures on $T$; that is, if $\omega$ is a state, there exists a regular Borel probability measure $\mu$ on $T$ such that $\omega(f) = \int f\, d\mu$ for every $f \in \mathcal{C}(T)$.
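Returning to the three-dimensional example above, the failure of homogeneity can be reproduced with exact rational arithmetic. The sketch below implements only the squaring operation, using the decomposition $x = \alpha x_0 + \beta I$ with $\alpha = (\max x_i - \min x_i)/2$ and $\beta = (\max x_i + \min x_i)/2$ together with $x_0^2 = I$, so that $x^2 = (\alpha^2 + \beta^2) I + 2\alpha\beta x_0$. This reading of the construction and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def square(x):
    """x^2 in the three-dimensional example, via x = alpha*x0 + beta*I."""
    hi, lo = Fraction(max(x)), Fraction(min(x))
    alpha = (hi - lo) / 2
    beta = (hi + lo) / 2
    if alpha == 0:
        # x is a multiple of I, and (beta*I)^2 = beta^2 * I.
        return tuple(beta * beta for _ in x)
    # x0 has maximum 1 and minimum -1, so x0^2 = I by definition.
    x0 = tuple((Fraction(xi) - beta) / alpha for xi in x)
    return tuple(alpha ** 2 + beta ** 2 + 2 * alpha * beta * c for c in x0)

def add(x, y):
    return tuple(p + q for p, q in zip(x, y))

def sub(x, y):
    return tuple(p - q for p, q in zip(x, y))

def scale(c, x):
    return tuple(c * p for p in x)

def sym(x, y):
    # Symmetrized product: ((x+y)^2 - x^2 - y^2) / 2.
    return scale(Fraction(1, 2), sub(sub(square(add(x, y)), square(x)), square(y)))

a = (1, 1, 0)
b = (1, 0, 1)

twice_product = scale(2, sym(a, b))   # 2(a o b)
product_twice = sym(scale(2, a), b)   # (2a) o b

print(square(add(a, b)))                 # (a + b)^2
print(square(add(scale(2, a), b)))       # (2a + b)^2
print(twice_product, product_twice)
```

The two results differ in the second component, so the symmetrized product in this example is not homogeneous.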
It follows from Segal's theorem and this description of the states that a compatible Segal algebra can be thought of as the set of observables in a classical mechanical system. Furthermore, compatible observables can be thought of as being simultaneously measurable.

(B) C*-Algebras

The Segal algebras considered earlier were based upon axioms that had physical relevance. Unfortunately, their mathematical structure is so weak that not much further progress has been made in terms of using them for the study of quantum theory. To proceed further, additional axioms have been imposed which have not been given physical justification. One of these is to postulate the distributive law for the symmetrized product. In a mathematical sense, the distributive law is a rather mild requirement. In fact, by the proof of Lemma 1 this law is equivalent to requiring that $(-A) \circ B = -(A \circ B)$ and $(2A) \circ B = 2(A \circ B)$ for all $A, B \in \mathcal{A}$. However, the physical reasons for such a requirement are lacking. Furthermore, additional axioms have been imposed which are not nearly so mild. These are best stated in terms of C*-algebras.

We first review the terminology of C*-algebras. A complex algebra is a complex vector space equipped with an identity and a distributive, associative product $AB$. An involution on a complex algebra $\mathfrak{B}$ is a map $*$ of $\mathfrak{B}$ into itself which satisfies $(A^*)^* = A$, $(A + B)^* = A^* + B^*$, $(\lambda A)^* = \lambda^* A^*$ and $(AB)^* = B^* A^*$ for every $A, B \in \mathfrak{B}$ and complex $\lambda$, with $\lambda^*$ denoting the complex conjugate of $\lambda$. An involutive algebra is a complex algebra equipped with an involution. A Banach algebra is an algebra equipped with a norm such that $\|AB\| \le \|A\|\,\|B\|$ and which is complete in the norm topology. A C*-algebra is an involutive Banach algebra $\mathfrak{B}$ satisfying $\|A^*A\| = \|A\|^2$ for every $A \in \mathfrak{B}$. An example of a C*-algebra is the set $\mathfrak{B}(H)$ of all bounded linear operators on a complex Hilbert space $H$. In this case $(*)$ is the adjoint map and $\|\cdot\|$ is the operator norm. An element $A$ of a C*-algebra is self-adjoint if $A^* = A$.
It is straightforward to show that the set of all self-adjoint elements of a C*-algebra forms a distributive Segal algebra. In this case the symmetrized product $A \circ B = \tfrac{1}{2}[(A + B)^2 - A^2 - B^2]$ takes the simple form $A \circ B = \tfrac{1}{2}(AB + BA)$. A distributive Segal algebra is said to be special if it is isomorphic to the set of all self-adjoint elements of a C*-algebra. Important unsolved problems are whether every distributive Segal algebra is special and, if not, to characterize special Segal algebras. These problems appear to be very difficult and it seems unlikely that an arbitrary distributive Segal algebra is special. However, all Segal algebras that have been encountered in physical situations have been special. For this reason and also because the theory of C*-algebras has a rich mathematical development it is postulated that the Segal algebra corresponding to a physical system is special. It can be shown that any state $\omega$ of a special Segal algebra can be extended to a positive (i.e. $\omega(AA^*) \ge 0$), normalized (i.e. $\omega(I) = 1$) linear functional on the C*-algebra. Having made this postulate, we can proceed to the study of C*-algebras, interpreting the self-adjoint elements as observables and the positive, normalized, linear functionals as states.

One of the important consequences of this last postulate is that it provides the mechanisms for representing the elements of a Segal algebra as self-adjoint operators on a Hilbert space. This follows from the GNS construction (after Gelfand and Naimark (1943) and Segal (1947)). We now develop the necessary material to understand this construction.

A map $\pi$ of a C*-algebra $\mathfrak{B}$ into the set $\mathfrak{B}(H)$ of all bounded linear operators on a Hilbert space $H$ is said to be a representation of $\mathfrak{B}$ if
(1). $\pi(\alpha A + \beta B) = \alpha\pi(A) + \beta\pi(B)$;
(2). $\pi(AB) = \pi(A)\pi(B)$;
(3). $\pi(A^*) = \pi(A)^*$;
for all $A, B \in \mathfrak{B}$ and complex numbers $\alpha$, $\beta$.
It can be shown that if $\pi$ is a representation of $\mathfrak{B}$, then $\|\pi(A)\| \le \|A\|$ for every $A \in \mathfrak{B}$. Thus a representation is automatically continuous. A representation $\pi : \mathfrak{B} \to \mathfrak{B}(H)$ is cyclic if there is a vector $\psi \in H$ such that the subspace $\pi(\mathfrak{B})\psi = \{\pi(A)\psi : A \in \mathfrak{B}\}$ is dense in $H$. In this case, $\psi$ is said to be a cyclic vector for $\pi$. Cyclic representations play an important role in the theory; in particular, it is easy to show that any representation is the direct sum of cyclic representations. In many physical applications the vacuum state plays the part of a cyclic vector.

A positive, normalized, linear functional on a C*-algebra $\mathfrak{B}$ is called a state. Let $\pi : \mathfrak{B} \to \mathfrak{B}(H)$ be a representation of $\mathfrak{B}$ and let $\psi \in H$, $\|\psi\| = 1$. Then the functional $\omega(A) = \langle \psi, \pi(A)\psi \rangle$ is a state called a vector state associated with the representation $\pi$. A state is pure if it cannot be written as a convex combination of other states. Let $\pi : \mathfrak{B} \to \mathfrak{B}(H)$ and $\pi' : \mathfrak{B} \to \mathfrak{B}(H')$ be two representations of $\mathfrak{B}$. If there is an isomorphism $U : H \to H'$ such that $\pi'(A) = U\pi(A)U^{-1}$ then $\pi$ and $\pi'$ are spatially (or unitarily) equivalent. A closed subspace $M$ of $H$ is invariant with respect to $\pi : \mathfrak{B} \to \mathfrak{B}(H)$ if $\pi(A)M \subseteq M$ for every $A \in \mathfrak{B}$. A representation $\pi : \mathfrak{B} \to \mathfrak{B}(H)$ is irreducible if the only invariant subspaces of $H$ with respect to $\pi$ are $\{0\}$ and $H$. Irreducible representations are the most economical in the sense that they cannot be written as a direct sum of representations.

Theorem 2. (The GNS Construction) Let $\mathfrak{B}$ be a C*-algebra and let $\omega$ be a state on $\mathfrak{B}$. Then there exists a Hilbert space $H$ and a cyclic representation $\pi_\omega : \mathfrak{B} \to \mathfrak{B}(H)$ with cyclic vector $\psi \in H$ such that $\omega(A) = \langle \psi, \pi_\omega(A)\psi \rangle$ for every $A \in \mathfrak{B}$. If $\pi' : \mathfrak{B} \to \mathfrak{B}(H')$ is another cyclic representation with cyclic vector $\psi' \in H'$ such that $\omega(A) = \langle \psi', \pi'(A)\psi' \rangle$ for every $A \in \mathfrak{B}$, then $\pi_\omega$ and $\pi'$ are spatially equivalent. Furthermore, $\pi_\omega$ is irreducible if and only if $\omega$ is pure.
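The statement of the theorem can be made concrete in a deliberately simple toy case. The sketch below assumes the commutative C*-algebra of complex functions on three points, with a state given by probability weights $p_i$, so that $\omega(f) = \sum p_i f_i$. In this case the GNS space is the space of functions on the support of the weights with inner product $\langle u, v \rangle = \sum p_i \bar{u}_i v_i$, the representation acts by pointwise multiplication, and the constant function 1 is the cyclic vector. The function `gns` and its return values are hypothetical names for illustration; the general construction (quotient by the null ideal followed by completion) is not shown.

```python
def gns(weights):
    """Toy GNS data for omega(f) = sum_i weights[i] * f[i].

    Returns (support, inner, pi, cyclic). All names are illustrative.
    """
    # Points of zero weight are divided out (the null space of the state).
    support = [i for i, p in enumerate(weights) if p > 0]

    def inner(u, v):
        # Inner product on functions defined on the support.
        return sum(weights[i] * u[k].conjugate() * v[k]
                   for k, i in enumerate(support))

    def pi(f):
        # pi(f) acts on the GNS space by pointwise multiplication.
        return lambda v: [f[i] * v[k] for k, i in enumerate(support)]

    cyclic = [1 + 0j for _ in support]
    return support, inner, pi, cyclic

def omega(weights, f):
    return sum(p * fi for p, fi in zip(weights, f))

mixed = [0.25, 0.75, 0.0]            # a mixture of two point evaluations
f = [2 + 1j, -1 + 0j, 5 + 0j]
g = [1j, 3 + 0j, -2 + 0j]

support, inner, pi, cyclic = gns(mixed)

# omega(f) = <psi, pi(f) psi> with psi the cyclic vector.
recovered = inner(cyclic, pi(f)(cyclic))
print(omega(mixed, f), recovered)

# pi is multiplicative: pi(fg) psi = pi(f) pi(g) psi.
fg = [x * y for x, y in zip(f, g)]
print(pi(fg)(cyclic) == pi(f)(pi(g)(cyclic)))

# Dimension of the GNS space: 2 for the mixture, 1 for a point evaluation.
pure_support = gns([0.0, 1.0, 0.0])[0]
print(len(support), len(pure_support))
```

In this commutative setting every coordinate subspace is invariant under pointwise multiplication, so the representation is irreducible exactly when the GNS space is one-dimensional, which happens exactly when the state is a point evaluation, that is, a pure state.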
The GNS construction has important physical consequences, one of which is the following. It is not the Hilbert space and the self-adjoint operators on it as postulated in the classical approach of Section 2 that contain the essence of the physical system, but it is the C*-algebra generated by the Segal algebra. The Hilbert space and the operators on it corresponding to the observables depend upon the state of the system and can be obtained via the GNS construction. There may be many inequivalent representations of a C*-algebra and the 'right' one is determined by the state of the system.

The above observation overcomes, to a certain extent, one of the weaknesses of the classical approach mentioned in Section 2, namely where the Hilbert space and self-adjoint operators come from. Another weakness mentioned in Section 2 that is overcome is the continuity condition placed on the states of the classical approach. In the algebraic approach, no such condition is imposed. The states are defined algebraically in terms of the physically natural conditions of linearity, positivity and normalization. In fact, there are more states in the algebraic approach than those given by the density operators of the classical approach. Furthermore, these extra states actually occur in physical situations. An example of such a state can be given as follows. Let $\mathcal{A}$ be the Segal algebra of all bounded self-adjoint operators on an infinite dimensional Hilbert space. Then $\mathcal{A}$ contains an operator $A$ with non-empty continuous spectrum $\sigma_c(A)$. Let $\lambda \in \sigma_c(A)$. It can be shown (Segal, 1947) that there exists a pure state $\omega$ such that $\omega(A) = \lambda$. Now this pure state cannot have the form $\omega(A) = \langle \psi, A\psi \rangle$ since then $\psi$ would be an eigenvector of $A$ with eigenvalue $\lambda$, which contradicts the fact that $\lambda \in \sigma_c(A)$. Such states as the above are delta function-like or Schwartz distribution-like elements that lie outside the Hilbert space.
It can be shown (Emch, 1972), however, that all states of $\mathcal{A}$ can be approximated in a certain topology by density operator states.

Although we have now formulated the basic concepts of the algebraic approach, we have only scratched the surface of its later developments. We have not mentioned such important areas as physical equivalence, symmetry groups, representations of canonical commutation and anticommutation relations, quasilocal field theories and applications to concrete problems. For further study the reader is referred to (Emch, 1972) and the modern literature.

(C) Strengths and Weaknesses

One of the strengths of the algebraic approach is that it is based on axioms that have more physical relevance than in the classical approach. This is especially true of the axioms for a Segal algebra. It has also clarified the existence of inequivalent representations of the canonical commutation relations (CCR) and canonical anticommutation relations (CAR). It has enjoyed some notable successes that we have not had space to explore and is responsible for important applications to such areas as statistical mechanics and solid-state physics.

Weaknesses of this approach include some of the later axioms, especially the jump from a Segal algebra to a C*-algebra. What, for example, is the physical significance of the product in a C*-algebra? Furthermore, even with all the mathematical power that has been brought to bear, a satisfactory field theory has still not been developed.

4. THE QUANTUM LOGIC APPROACH

In the quantum logic approach the propositions of a physical system are taken as primitive axiomatic elements. The propositions correspond to yes-no (or true-false) experiments on the physical system. For example, suppose the physical system consists of a single particle and let $E$ be a region of space. If $a_E$ is a counter which is activated if and only if the particle enters the region $E$, then $a_E$ corresponds to a proposition.
This proposition is true if and only if $a_E$ is activated and the particle is in the region $E$. The propositions correspond to two-valued observables and it can be argued (we shall substantiate this later) that an observable can be decomposed into these simpler two-valued observables. Thus a treatment of propositions is general enough to describe all observables. The standard references on this approach are Jauch (1968), Mackey (1963) and Varadarajan (1968).

(A) Quantum Logics

Let $\mathcal{P}_0$ be a set of elements called experimental propositions. If $a \in \mathcal{P}_0$ then $a$ is true or false depending upon the state of the system. But in quantum mechanics one cannot, in general, predict whether a proposition will be true even if the state is precisely known. All one can predict is the probability that a proposition will be true. Thus a state $m$ can be thought of as a function from $\mathcal{P}_0$ to the unit interval $[0, 1]$ and $m(a)$ for $a \in \mathcal{P}_0$ is interpreted as the probability $a$ is true when the system is in the state $m$. If $m(a) = 1$ then $a$ is true with certainty in the state $m$. Now suppose $a, b \in \mathcal{P}_0$ and $m(a) + m(b) \le 1$ for every state $m$. Since $m(a) = 1$ implies $m(b) = 0$, whenever $a$ is true with certainty, $b$ is false with certainty. In this case $a$ and $b$ can be interpreted as corresponding to non-interfering experiments and their truth or falsity can be verified simultaneously. In this case one can consider the experimental proposition $c$ which is true with certainty precisely when $a$ and $b$ are both false with certainty. Then we should have $m(c) + m(a) + m(b) = 1$ for every state $m$. For example, in our counter experiment suppose $E_1$ and $E_2$ are disjoint regions of space. Then for any state $m$ the probability the particle is in $E_1$ plus the probability the particle is in $E_2$ does not exceed unity, $m(a_{E_1}) + m(a_{E_2}) \le 1$. Now if $E_3 = (E_1 \cup E_2)'$ ($E'$ is the complement of the set $E$) we should have $m(a_{E_3}) + m(a_{E_2}) + m(a_{E_1}) = 1$. Such considerations also carry over to sequences of propositions.
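The counter example can be mimicked with a purely classical toy model. The sketch below assumes that space is discretised into six cells and that a state is a probability distribution over the cells, so that the probability of a counter firing is the total weight of its region. The cell layout, the regions and the particular state are hypothetical choices for illustration; the model is classical and says nothing about genuinely quantum states.

```python
from fractions import Fraction

CELLS = range(6)   # hypothetical discretised space with six cells

def prob(state, region):
    # m(a_E): probability that the counter for region E is activated.
    return sum(state[cell] for cell in region)

E1 = {0, 1}
E2 = {3}
E3 = set(CELLS) - (E1 | E2)   # complement of the union of E1 and E2

# An arbitrary state: probability weights over the cells, summing to 1.
state = [Fraction(1, 12), Fraction(1, 4), Fraction(1, 6),
         Fraction(1, 3), Fraction(1, 12), Fraction(1, 12)]

pair_sum = prob(state, E1) + prob(state, E2)
triple_sum = prob(state, E3) + prob(state, E2) + prob(state, E1)

print(pair_sum)     # does not exceed 1 because E1 and E2 are disjoint
print(triple_sum)   # the three regions exhaust the space
```

Here the proposition for $E_3$ plays the role of $c$ in the text: it is true with certainty precisely when the counters for $E_1$ and $E_2$ are both certainly silent.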
A proposition system is a pair $(\mathcal{P}_0, \mathcal{M})$ where $\mathcal{P}_0$ is a non-empty set and $\mathcal{M}$ is a non-empty set of functions from $\mathcal{P}_0$ into $[0, 1]$ satisfying:

Axiom A. For any sequence $a_1, a_2, \ldots \in \mathcal{P}_0$ such that $m(a_i) + m(a_j) \le 1$, $i \ne j$, for every $m \in \mathcal{M}$, there exists $b \in \mathcal{P}_0$ such that $m(b) + m(a_1) + m(a_2) + \cdots = 1$ for every $m \in \mathcal{M}$.

Axiom A is the only axiom that we shall impose on the system. We call the elements of $\mathcal{M}$ states and the elements of $\mathcal{P}_0$ experimental propositions. Now suppose $a, b \in \mathcal{P}_0$ and $m(a) = m(b)$ for every $m \in \mathcal{M}$. Then $a$ and $b$ are physically indistinguishable and we write $a \sim b$. It is clear that $\sim$ is an equivalence relation and we denote the equivalence class containing $a$ by $\bar{a}$. Furthermore, if $m \in \mathcal{M}$ we define $m(\bar{a}) = m(a)$. We denote the set of equivalence classes by $\mathcal{P}$ and call the elements of $\mathcal{P}$ propositions. We call the pair $(\mathcal{P}, \mathcal{M})$ a quantum logic. Thus a quantum logic is a pair $(\mathcal{P}, \mathcal{M})$ satisfying Axiom A (with $\mathcal{P}_0$ replaced by $\mathcal{P}$) together with the condition that $m(a) = m(b)$ for every $m \in \mathcal{M}$ implies $a = b$. The quantum logic will be the main framework of our study and for simplicity we shall drop the bars in the following.

Our next order of business is to prove the main structure theorem for quantum logics. But first we need some definitions. Let $(\mathcal{P}, \le)$ be a partially ordered set with first and last elements 0 and 1, respectively. An orthocomplementation on $\mathcal{P}$ is a map $a \mapsto a'$ from $\mathcal{P}$ to $\mathcal{P}$ with the following properties:

(1). $a'' = a$ for every $a \in \mathcal{P}$;
(2). $a \le b$ implies $b' \le a'$;
(3). $a \vee a' = 1$ for every $a \in \mathcal{P}$.

In (3) $a \vee a'$ denotes the least upper bound of $a$ and $a'$. If $(\mathcal{P}, \le)$ has an orthocomplementation ($'$), then $(\mathcal{P}, \le, ')$ is called an orthocomplemented poset. It is easily verified that in an orthocomplemented poset, if $a \vee b$ exists then so does $a' \wedge b'$, and $a' \wedge b' = (a \vee b)'$. If $a \le b'$, we say that $a$ and $b$ are orthogonal and write $a \perp b$. An orthocomplemented poset $(\mathcal{P}, \le, ')$ is $\sigma$-orthocomplete if the following holds:

(4). If $a_1, a_2, \ldots$
is a sequence of mutually orthogonal elements in $\mathcal{P}$ then $a_1 \vee a_2 \vee \cdots$ exists.

An orthocomplemented, $\sigma$-orthocomplete poset $(\mathcal{P}, \le, ')$ is orthomodular if

(5). $a \le b$ implies $b = a \vee (b \wedge a')$.

A probability measure on a $\sigma$-orthocomplete poset $(\mathcal{P}, \le, ')$ is a map $m : \mathcal{P} \to [0, 1]$ which satisfies:

(a) $m(1) = 1$;
(b) if $a_1, a_2, \ldots$ is a sequence of mutually orthogonal elements of $\mathcal{P}$ then $m(\bigvee a_i) = \sum m(a_i)$.

A set $\mathcal{M}$ of probability measures on $(\mathcal{P}, \le, ')$ is order determining if $m(a) \le m(b)$ for every $m \in \mathcal{M}$ implies $a \le b$.

Let us now return to the quantum logic $(\mathcal{P}, \mathcal{M})$. If $a, b \in \mathcal{P}$ define $a \le b$ if $m(a) \le m(b)$ for every $m \in \mathcal{M}$. The relation $a \le b$ can be interpreted as $a$ implies $b$. That is, $b$ has at least as great a probability of being true as $a$ in every state. It follows that whenever $a$ is true with certainty so is $b$. If $a \in \mathcal{P}$, since $m(a) \le 1$ for every $m \in \mathcal{M}$, by Axiom A there exists $b \in \mathcal{P}$ such that $m(b) = 1 - m(a)$ for every $m \in \mathcal{M}$. We then write $b = a'$ and call $b$ the orthogonal complement of $a$. We can interpret $a'$ as the proposition which is true if and only if $a$ is false. We have thus defined a relation $\le$ on $\mathcal{P}$ and a map ($'$) $: \mathcal{P} \to \mathcal{P}$. We now prove our main structure theorem. This theorem is due to Maczynski (1974) and the proof is a simplification of his.

Theorem 3. $(\mathcal{P}, \mathcal{M})$ is a quantum logic if and only if $(\mathcal{P}, \le, ')$ is a $\sigma$-orthocomplete orthomodular poset and $\mathcal{M}$ is an order-determining set of probability measures on $\mathcal{P}$.

Proof. First suppose $(\mathcal{P}, \mathcal{M})$ is a quantum logic. It is clear from the definition that $(\mathcal{P}, \le)$ is a poset. It is also clear that ($'$) satisfies conditions (1) and (2) of an orthocomplementation. For $a \in \mathcal{P}$, we have $m(a) + m(a') = 1$ for every $m \in \mathcal{M}$, so by Axiom A there is an element $0 \in \mathcal{P}$ such that $m(0) + m(a) + m(a') = 1$. It follows that $m(0) = 0$ for every $m \in \mathcal{M}$. Define $1 = 0'$. Notice that $0 \le a \le 1$ for every $a \in \mathcal{P}$, so 0 and 1 are the first and last elements of $\mathcal{P}$, respectively. If $b \ge a, a'$ then $m(a), m(a') \le m(b)$. Hence $m(a) + m(b') \le 1$ and $m(a') + m(b') \le 1$.
Then $a, a', b'$ satisfy the condition of Axiom A, so there exists $c \in \mathcal{P}$ such that $m(c) + m(a) + m(a') + m(b') = 1$ for every $m \in \mathcal{M}$. But then $m(c) = m(b') = 0$ for every $m \in \mathcal{M}$. Hence $m(b) = 1$ for every $m \in \mathcal{M}$ and $b = 1$. Hence (3) is satisfied and ($'$) is an orthocomplementation.

Now $a \perp b$ if and only if $m(a) + m(b) \le 1$ for every $m \in \mathcal{M}$. Thus if $a_1, a_2, \ldots$ is a sequence of mutually orthogonal elements, then by Axiom A there exists $b \in \mathcal{P}$ such that $m(b) + m(a_1) + m(a_2) + \cdots = 1$. Hence $m(b') = \sum m(a_i)$ for every $m \in \mathcal{M}$. It follows that $b' \ge a_1, a_2, \ldots$. Now suppose $c \ge a_1, a_2, \ldots$. Then $m(a_i) + m(c') \le 1$ for every $m \in \mathcal{M}$. Hence the sequence $c', a_1, a_2, \ldots$ satisfies the condition of Axiom A, so there exists $d \in \mathcal{P}$ such that $m(d) + m(c') + \sum m(a_i) = 1$. Then $m(c) = m(d) + m(b') \ge m(b')$ for every $m \in \mathcal{M}$, so $c \ge b'$. It follows that $b' = \bigvee a_i$ and $\mathcal{P}$ is $\sigma$-orthocomplete. Furthermore, since $m(\bigvee a_i) = m(b') = \sum m(a_i)$ and $m(1) = 1$, it follows that every $m \in \mathcal{M}$ is a probability measure on $\mathcal{P}$. It is obvious that $\mathcal{M}$ is order determining.

To show that $\mathcal{P}$ is orthomodular, suppose $a \le b$. Then $a \perp b'$, and since $a \le a \vee b' = (a' \wedge b)'$, we have $a \perp a' \wedge b$. Hence for every $m \in \mathcal{M}$

$$m[a \vee (a' \wedge b)] = m(a) + m(a' \wedge b) = m(a) + 1 - m(a \vee b') = m(a) + 1 - m(a) - m(b') = m(b).$$

Therefore, $b = a \vee (a' \wedge b)$.

Conversely, suppose $(\mathcal{P}, \le, ')$ is a $\sigma$-orthocomplete orthomodular poset and $\mathcal{M}$ is an order-determining set of probability measures on $\mathcal{P}$. Then $m(a) = m(b)$ for every $m \in \mathcal{M}$ implies $a = b$. Suppose $a_1, a_2, \ldots$ is a sequence in $\mathcal{P}$ such that $m(a_i) + m(a_j) \le 1$, $i \ne j$, for every $m \in \mathcal{M}$. Then $m(a_i) \le 1 - m(a_j) = m(a_j')$, so $a_i \perp a_j$, $i \ne j$. Hence $b = \bigvee a_i$ exists and $m(b') + \sum m(a_i) = 1$. Thus Axiom A holds and $(\mathcal{P}, \mathcal{M})$ is a quantum logic.

We thus see that there is no difference between a quantum logic and a $\sigma$-orthocomplete orthomodular poset with an order-determining set of probability measures. Notice that a quantum logic need not be a lattice (that is, $a \vee b$ and $a \wedge b$ need not exist). For example, let $\Omega = \{1, 2, 3, 4, 5, 6\}$ and let $\mathcal{P}$ be the collection of subsets of $\Omega$ with an even number of elements.
Order $\mathcal{P}$ by inclusion and let ($'$) be the usual set complementation. For $i = 1, \ldots, 6$ define, for $a \in \mathcal{P}$, $m_i(a) = 1$ if $i \in a$ and $m_i(a) = 0$ if $i \notin a$. Then if we let $\mathcal{M} = \{m_i : i = 1, \ldots, 6\}$ it is easily verified that $(\mathcal{P}, \mathcal{M})$ is a quantum logic. However, $\mathcal{P}$ is not a lattice since, for example, $\{1, 2, 3, 4\} \wedge \{2, 3, 4, 5\}$ does not exist.

We say that two propositions $a, b$ are compatible (written $a \leftrightarrow b$) if there exist mutually disjoint propositions $a_1$, $b_1$ and $c$ such that $a = a_1 \vee c$, $b = b_1 \vee c$. We shall see that compatible propositions are ones that can be verified simultaneously; that is, propositions whose experiments do not interfere. Notice, if $a \perp b$ then $a \leftrightarrow b$, and $0 \leftrightarrow a$, $1 \leftrightarrow a$ for every $a \in \mathcal{P}$. Physically, our interpretation of $a \le b$ demands that $a \leftrightarrow b$ if $a \le b$. This is indeed the case since if $a \le b$ then by (5) $b = a \vee (b \wedge a')$ and $a = a \vee 0$, where $a \perp (b \wedge a')$.

We now show how observables can be defined. If $x$ is an arbitrary observable and $E \in B(R)$ is a Borel set on $R$, then the pair $(x, E)$ corresponds to the proposition: 'the observable $x$ has a value in the set $E$'. Thus if $(\mathcal{P}, \mathcal{M})$ is a quantum logic, an observable can be thought of as a map $x : B(R) \to \mathcal{P}$. Furthermore, an observable should satisfy:

(1). $x(R) = 1$;
(2). If $E \cap F = \emptyset$, then $x(E) \perp x(F)$;
(3). If $E_i \in B(R)$ is a sequence of mutually disjoint sets, then $x(\bigcup E_i) = \bigvee x(E_i)$.

The reader can easily justify these three conditions. Two observables $x, y$ are compatible (written $x \leftrightarrow y$) if $x(E) \leftrightarrow y(F)$ for all $E, F \in B(R)$. We shall show later that observables which are compatible may be thought of physically as being simultaneously measurable.

The reader should note that we have constructed a generalized probability theory. Instead of being a Boolean $\sigma$-algebra of subsets of a set, our events (propositions) are more general, belonging to a less restrictive structure. The usual probability measures are replaced by states and the random variables by observables.
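The six-point example given earlier in this passage can be checked by brute force. The following is a minimal sketch in Python; the names `PROPS`, `STATES`, `leq` and `meet` are illustrative choices and do not come from the text. It builds the even-cardinality subsets of $\{1, \ldots, 6\}$, defines the order through the states $m_i$ exactly as in the definition of $\le$ for a quantum logic, and searches for a greatest lower bound.

```python
from itertools import combinations

OMEGA = frozenset(range(1, 7))

# Propositions: all subsets of OMEGA with an even number of elements.
PROPS = [frozenset(c)
         for r in range(0, 7, 2)
         for c in combinations(sorted(OMEGA), r)]

def state(i):
    """The state m_i: m_i(a) = 1 if i is in a, and 0 otherwise."""
    return lambda a: 1 if i in a else 0

STATES = [state(i) for i in sorted(OMEGA)]

def leq(a, b):
    """a <= b in the quantum logic: m(a) <= m(b) for every state m."""
    return all(m(a) <= m(b) for m in STATES)

def meet(a, b):
    """Greatest lower bound of a and b within PROPS, or None if there is none."""
    lower = [c for c in PROPS if leq(c, a) and leq(c, b)]
    greatest = [c for c in lower if all(leq(d, c) for d in lower)]
    return greatest[0] if greatest else None

a = frozenset({1, 2, 3, 4})
b = frozenset({2, 3, 4, 5})
print(len(PROPS))   # 32
print(meet(a, b))   # None
```

The lower bounds of the two sets are the even subsets of $\{2, 3, 4\}$, namely $\emptyset$, $\{2,3\}$, $\{2,4\}$ and $\{3,4\}$, and none of these contains all the others, which is why no meet exists.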
Notice that if $x$ is an observable and $m$ a state, then the probability that $x$ has a value in $E \in B(R)$ when the system is in state $m$ is $m[x(E)]$.

Before proceeding further, let us consider two examples of quantum logics.

Example 1. Let $\Omega$ be a phase space and let $B(\Omega)$ be the Borel subsets of $\Omega$. $B(\Omega)$ may be thought of as the set of mechanical events. It is easily checked that $B(\Omega)$, under set inclusion and complementation, is a $\sigma$-orthocomplete, orthomodular poset (in fact, it is a Boolean $\sigma$-algebra). The set of states $\mathcal{M}$ are the conventional probability measures on $B(\Omega)$ and these are order determining, so $(B(\Omega), \mathcal{M})$ is a quantum logic. If $x$ is an observable, it follows from a theorem of Sikorski (1949)-Varadarajan (1962) that there is a measurable function $f : \Omega \to R$ such that $x(E) = f^{-1}(E)$ for every $E \in B(R)$. Thus observables are just inverses of dynamical variables. We thus see that the quantum logic generalizes classical mechanics and also the conventional Kolmogorov (1956) formulation of probability theory. It is easily checked that all events (propositions) and observables are compatible in this example.

Example 2. Let $H$ be a complex Hilbert space and let $\mathcal{P}$ be the collection of all closed subspaces of $H$. Ordering $\mathcal{P}$ by inclusion and defining the complement of a subspace as its orthocomplement, it is easily checked that $\mathcal{P}$ is a $\sigma$-orthocomplete, orthomodular poset (in fact, a lattice). Furthermore, the set of states $\mathcal{M}$, by Gleason's theorem (see Section 2), are given by density operators and are order determining. Hence $(\mathcal{P}, \mathcal{M})$ is a quantum logic. Identifying closed subspaces with their orthogonal projections, an observable may be thought of as a projection-valued measure. Since, using the spectral theorem, there is a one-to-one correspondence between projection-valued measures and self-adjoint operators, we may identify observables with self-adjoint operators.
Thus the quantum-logic approach, in this case, reduces to the classical approach of Section 2. It is straightforward to show that $a, b \in \mathcal{P}$ are compatible if and only if their corresponding projections commute. It follows that two observables are compatible if and only if their corresponding self-adjoint operators commute.

Let us now return to quantum logics. If $x$ is an observable we call $\{x(E) : E \in B(R)\}$ the range of $x$. It is easily verified that the range of an observable is a Boolean $\sigma$-algebra.

Lemma 4. (Varadarajan, 1962) Two propositions are compatible if and only if they are in the range of a single observable.

This last lemma justifies the fact that compatible propositions are simultaneously verifiable, since to verify two compatible propositions one need measure only a single observable.

Now let $x$ be an observable and let $u : R \to R$ be a Borel function. There is an operational significance for $u(x)$; namely, if $x$ has the value $\lambda \in R$, then $u(x)$ has the value $u(\lambda)$. This is equivalent to saying that the proposition '$u(x)$ has a value in $E \in B(R)$' is the same as the proposition '$x$ has a value in $u^{-1}(E)$'. Motivated by this, we define $u(x)$ as $u(x)(E) = x[u^{-1}(E)]$ for all $E \in B(R)$. It is easily checked that $u(x)$ is an observable and that $u(x) \leftrightarrow x$.

Theorem 5. (Varadarajan, 1962) Two observables $x, y$ are compatible if and only if there exists an observable $z$ and Borel functions $u, v$ such that $x = u(z)$ and $y = v(z)$.

This last theorem shows that, physically, compatible observables are simultaneously measurable (i.e. non-interfering), since to measure two compatible observables one need only measure a single observable.

Space does not permit a comparison of the quantum logic approach to the algebraic approach of Section 3. However, we mention that it can be shown that the approaches are not equivalent. It can also be shown that the Segal algebra of Section 3 can be embedded in a weaker type of quantum logic than that considered here (Gudder and Boyce, 1970; Plymen, 1968).
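The definition $u(x)(E) = x[u^{-1}(E)]$ given above can be illustrated in the classical setting of Example 1, where an observable is the inverse image map of a dynamical variable. The following Python sketch makes several simplifying assumptions that are not in the text: the phase space is a six-point set, Borel sets are represented as predicates on real values, and the functions `f` and `u` are arbitrary illustrative choices.

```python
# Finite classical sketch: a proposition is a subset of a small "phase space".
OMEGA = frozenset(range(6))

def f(w):
    """Hypothetical dynamical variable f: Omega -> R (values -2, ..., 3)."""
    return w - 2

def u(v):
    """Hypothetical Borel function u: R -> R."""
    return v * v

def observable(fn):
    """Return the observable x with x(E) = fn^{-1}(E).

    A Borel set E is represented by a predicate on real values.
    """
    return lambda E: frozenset(w for w in OMEGA if E(fn(w)))

def apply_function(u, x):
    """u(x)(E) = x[u^{-1}(E)], where u^{-1}(E) is the predicate v -> E(u(v))."""
    return lambda E: x(lambda v: E(u(v)))

x = observable(f)
ux = apply_function(u, x)

E = lambda v: v in (1, 4)      # the set E = {1, 4}
print(sorted(ux(E)))            # [0, 1, 3, 4]
```

In this sketch `ux` agrees with the observable built directly from the composite variable $u \circ f$, and every proposition in the range of `ux` is also in the range of `x`, consistent with the statement that $u(x) \leftrightarrow x$.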
(B) Quantum Systems

Although some illuminating and physically valuable results have been obtained in the study of quantum logics, their structure is mathematically so general that they have not been particularly useful for concrete calculations. A quantum logic is so general that it is far from the concreteness of the Hilbert space of the classical approach. What is needed is something like the GNS construction of the algebraic approach. However, such a construction is impossible unless more axioms are imposed on the quantum logic. Such steps have been taken (Piron, 1964; Zierler, 1961; Varadarajan, 1968) and theorems have been found which represent the propositions of certain types of restricted quantum logics as closed subspaces of a Hilbert space. However, many of the additional axioms do not have convincing physical justification. This point is, of course, arguable. The usual additional axioms are that $\mathcal{P}$ is a complete, atomic, semi-modular lattice.

There is another approach which does bring the Hilbert space forward without imposing additional artificial axioms on the quantum logic $(\mathcal{P}, \mathcal{M})$. This is to adjoin physical structures to $(\mathcal{P}, \mathcal{M})$ such as physical space, position observables and symmetry. After all, in the known physical systems there is always more than just the quantum logic. There is a space in which the system lives, usually some sort of symmetry involved and some kind of distinguished observable such as position. We now briefly explore this approach.

First of all, any physical system concerns a phenomenon that takes place in some kind of physical arena which we call physical space. Mathematically, we shall assume that physical space $\mathcal{S}$ is a locally compact Hausdorff space with second countability. (We include these mathematical esoterics for preciseness. For the definitions of these terms see any book on topology, or the reader can assume $\mathcal{S} = R^3$, which is general enough for many discussions and which is the prototype of such spaces.)
In a concrete physical situation, $\mathcal{S}$ might be $R$, or $R^n$, or perhaps four-dimensional space-time, or some region in these spaces. Now many of the propositions in $\mathcal{P}$ are concerned with the location of the physical system in $\mathcal{S}$. If such propositions can be verified in the laboratory we call the system localizable. We shall now define this term mathematically. Let $B(\mathcal{S})$ denote the Borel sets in $\mathcal{S}$, and if $E \in B(\mathcal{S})$ let the proposition that the physical system is located in $E$ be denoted by $X(E)$. Thus $X$ is a map from $B(\mathcal{S})$ into $\mathcal{P}$. It is clear that $X$ is an observable based on $\mathcal{S}$, so

(1). $X(\mathcal{S}) = 1$;
(2). If $E \cap F = \emptyset$, then $X(E) \perp X(F)$;
(3). $X(\bigcup E_i) = \bigvee X(E_i)$ if $E_i \cap E_j = \emptyset$, $i \ne j$.

We require that $X$ be maximal in a certain sense. Specifically, let $R(X) \subseteq \mathcal{P}$ denote the range of $X$ and let $w$ be a probability measure with domain $R(X)$. We say that $X$ is maximal if every probability measure $w$ on $R(X)$ has a unique extension $\bar{w} \in \mathcal{M}$. We say that a physical system is localizable if there exists a maximal observable (called a position observable) $X : B(\mathcal{S}) \to \mathcal{P}$. There are physical systems that are not localizable. However, as indicated by the work of Jauch and Piron (1967), many of these systems can be handled using a weaker notion of position observable. In this section we shall henceforth only consider localizable systems.

We now consider symmetries. A symmetry may be thought of intuitively as being a transformation that maps the system into another system which is physically identical with the original one except for a relabelling. If $a \in \mathcal{P}$, then after a symmetry transformation we get a new proposition $Wa$. Thus a symmetry induces a map $W : \mathcal{P} \to \mathcal{P}$. Since $W$ just relabels the propositions, $W$ is a bijection on $\mathcal{P}$ that preserves all the operations on $\mathcal{P}$; that is, $W$ is an automorphism on $\mathcal{P}$. We denote the automorphisms on $\mathcal{P}$ by aut($\mathcal{P}$) and notice that aut($\mathcal{P}$) is a group. Usually symmetries come from transformations on the physical space $\mathcal{S}$.
We say that a group $G$ is a transformation group on $\mathcal{S}$ if $G$ is a locally compact topological group with second countability for which there exists a continuous map from $G \times \mathcal{S}$ onto $\mathcal{S}$, denoted by $(g, s) \mapsto gs$, such that

(1). $s \mapsto gs$ is a homeomorphism of $\mathcal{S}$ with itself for every $g \in G$;
(2). $g_1(g_2(s)) = (g_1 \cdot g_2)(s)$ for every $g_1, g_2 \in G$;
(3). if $s_1, s_2 \in \mathcal{S}$, there exists $g \in G$ such that $s_1 = g s_2$ (transitivity);
(4). $gs = s$ for every $s \in \mathcal{S}$ if and only if $g = e$, the identity element of $G$ (effectiveness).

Now if a transformation group is a symmetry for the system it must induce an automorphism group on $\mathcal{P}$. Let $\mathcal{L} = (\mathcal{P}, \mathcal{M})$ be a quantum logic, $\mathcal{S}$ a physical space and $X$ a position observable. A symmetry group on $(\mathcal{L}, \mathcal{S}, X)$ is a pair $\mathcal{G} = (G, W)$ where $G$ is a transformation group on $\mathcal{S}$ and $W$ is a group homomorphism $W : G \to$ aut($\mathcal{P}$) (i.e. $W_{g_1 g_2} = W_{g_1} W_{g_2}$) such that

(W1). $g \mapsto m(W_g(a))$ is continuous for every $m \in \mathcal{M}$, $a \in \mathcal{P}$;
(W2). $X(gE) = W_g(X(E))$ for every $g \in G$, $E \in B(\mathcal{S})$ (covariance).

Condition (W1) is a natural continuity requirement, while (W2) is a covariance condition which gives the natural interpretation that $W_g(X(E))$ is the proposition that the system is located in the set $gE$. We call $W$ a projective representation of $G$ in $\mathcal{P}$. We thus see that $g \mapsto W_g$ gives a generalization of a continuous unitary representation of a group and (W2) generalizes Mackey's imprimitivity relation (Mackey, 1968).

This completes the background for our extended axiomatic structure. We shall call a four-tuple $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$, where $\mathcal{L} = (\mathcal{P}, \mathcal{M})$ is a quantum logic, $\mathcal{S}$ a physical space, $X$ a position observable and $\mathcal{G} = (G, W)$ a symmetry group, a quantum system. We take the viewpoint that the important physical properties of a physical system are described by a quantum system.

Let us consider an example. This is the usual formulation of a spinless, non-relativistic particle moving in one-dimensional space.
The set of propositions $\mathcal{P}$ is the lattice of closed subspaces (or equivalently, the lattice of orthogonal projections) of the complex Hilbert space $L^2(R, \mu)$, where $\mu$ is Lebesgue measure on $R$. Let $\mathcal{M}$ be the set of pure states of the form $m_f(P) = \langle f, Pf \rangle$ where $f \in L^2(R, \mu)$, $\|f\| = 1$ and $f \ge 0$. Then $\mathcal{M}$ is an order-determining set of states and $(\mathcal{P}, \mathcal{M})$ is a quantum logic. Let $G$ be the group of translations on $R$; that is, for $a \in R$, $\lambda \mapsto \lambda + a$ is a transformation group on $R$. Let $U_a : L^2(R, \mu) \to L^2(R, \mu)$ be the map $(U_a f)(\lambda) = f(\lambda - a)$. Then $U_a$ is a unitary operator, and if we define $W_a P = U_a P U_a^{-1}$ for every $P \in \mathcal{P}$, then $W_a \in$ aut($\mathcal{P}$) and $(G, W)$ is a symmetry group. The position observable is given by $(X(E)f)(\lambda) = \chi_E(\lambda) f(\lambda)$, where $\chi_E$ is the characteristic function of $E \in B(R)$.

We now show that $X$ is maximal. Let $\nu$ be a probability measure on $R(X)$ and define $\bar{\nu}(E) = \nu(X(E))$, $E \in B(R)$. Then $\bar{\nu}$ is a measure on $B(R)$ that is absolutely continuous relative to $\mu$ (i.e. $\mu(E) = 0$ implies $\bar{\nu}(E) = 0$). Hence by the Radon-Nikodym theorem there is a unique $f \in L^1(R, \mu)$, $f \ge 0$, such that $\bar{\nu}(E) = \int_E f \, d\mu$ for every $E \in B(R)$. Let $g = f^{1/2}$, so that

$$\bar{\nu}(E) = \int_E g^2 \, d\mu = \int \chi_E \, g^2 \, d\mu = \langle g, X(E) g \rangle.$$

Then $m_g \in \mathcal{M}$, and since $m_g(X(E)) = \nu(X(E))$ for every $E \in B(R)$, we see that $m_g$ is the unique extension of $\nu$ in $\mathcal{M}$.

This last example is canonical in a certain respect. We shall show that corresponding to any quantum system $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ there is an underlying Hilbert space and constructs similar to those in the above example that mirror much of the axiomatic structure of $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$. A $\sigma$-finite measure $\mu$ on $B(\mathcal{S})$ is quasi-invariant relative to $G$ if $\mu(E) = 0$ if and only if $\mu(gE) = 0$ for every $g \in G$.

Lemma 6. (Gudder, 1973c) Let $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ be a quantum system. Then there is a non-zero $\sigma$-finite quasi-invariant measure $\mu$ on $B(\mathcal{S})$ such that for every $m \in \mathcal{M}$ the measure $E \mapsto m(X(E))$ is absolutely continuous relative to $\mu$.
The space $L^2(F, \mu)$ will serve as the underlying Hilbert space, where $F$ is a certain subset of $\mathcal{S}$. Two states $m_1$ and $m_2$ are orthogonal if there is $a \in \mathcal{P}$ such that $m_1(a) = m_2(a') = 0$. A set of vectors $H_0$ is said to generate a Hilbert space $H$ if the closed linear hull of $H_0$ is $H$.

Theorem 7. (Gudder, 1973c) Let $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ be a quantum system. Then there exists $F \in B(\mathcal{S})$ with the following properties: (a) $\mu(F) \ne 0$; (b) if $E \in B(F)$ and $\mu(E) \ne 0$, then $X(E) \ne 0$; (c) there is a one-to-one map $m \mapsto \hat{m}$ from $\mathcal{M}$ onto a generating set $\hat{\mathcal{M}}$ in the complex Hilbert space $L^2(F, \mu)$ that preserves orthogonality.

We shall now see that the Hilbert space $L^2(F, \mu)$ derived in Theorem 7 mirrors many of the structural properties of the quantum system $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$. Let us first consider the symmetry group $\mathcal{G} = (G, W)$. Let $V_0$ be the complex vector space generated by $\hat{\mathcal{M}} = \{\hat{m} : m \in \mathcal{M}\}$, where $m \mapsto \hat{m}$ is the map given in Theorem 7. Now $W_g \in$ aut($\mathcal{P}$) can be thought of as a map from $\mathcal{M}$ into $\mathcal{M}$ defined by

$$(W_g m)(a) = m(W_g(a)), \quad m \in \mathcal{M},\ a \in \mathcal{P},\ g \in G.$$

Then $W_g$ induces a natural transformation $\hat{W}_g$ on $\hat{\mathcal{M}}$ defined by $\hat{W}_g \hat{m} = (W_g m)^{\wedge}$. This map is well-defined since $m \mapsto \hat{m}$ is injective. We next extend $\hat{W}_g$ to $V_0$ by linearity. If $g \in G$ define $\mu_g$ by $\mu_g(E) = \mu[g^{-1}(E)]$ for every $E \in B(F)$. Then $\mu_g$ is absolutely continuous with respect to $\mu$. Let $d\mu_g/d\mu$ be the Radon-Nikodym derivative.

Theorem 8. (Gudder, 1973c) The map $g \mapsto \hat{W}_g$ is a continuous unitary representation on $V_0$ and

$$(\hat{W}_g f)(\lambda) = f(g^{-1}\lambda)\left[\frac{d\mu_g}{d\mu}(\lambda)\right]^{1/2}$$

for every $f \in V_0$.

Now $L^2(F, \mu) = \bar{V}_0$, the closure of $V_0$, and $\hat{W}_g$ can be extended to a unitary transformation on $L^2(F, \mu)$ which we also denote by $\hat{W}_g$. We thus see that the states $\mathcal{M}$ are represented by certain unit vectors $\hat{\mathcal{M}}$ in the Hilbert space $L^2(F, \mu)$ and that the symmetry group $\mathcal{G}$ is represented by a unitary representation $\hat{W}_g$ of $G$ on $L^2(F, \mu)$. We represent $X$ on $L^2(F, \mu)$ by $\hat{X}(E) =$ proj.
on the closed span of $\{\hat{m} : m(X(E)) = 1\}$.

Theorem 9. (Gudder, 1973c) If $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ is a quantum system, then $[\hat{X}(E)f](\lambda) = (\chi_E f)(\lambda)$, $f \in L^2(F, \mu)$.

Denote the lattice of all orthogonal projections on $L^2(F, \mu)$ by $\hat{\mathcal{P}}$. We thus see that $\hat{X}$ is a position observable on $\hat{\mathcal{P}}$. Now $\hat{W}_g$ induces an automorphism on $\hat{\mathcal{P}}$ defined by $W_g^* P = \hat{W}_g P \hat{W}_g^{-1}$ for every $P \in \hat{\mathcal{P}}$.

Theorem 10. (Gudder, 1973c) If $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ is a quantum system, then $\hat{X}[g(E)] = W_g^* \hat{X}(E)$ for all $g \in G$, $E \in B(F)$, and $(G, W_g^*)$ forms a symmetry group on $\hat{\mathcal{P}}$.

Letting $\hat{\mathcal{L}} = (\hat{\mathcal{P}}, \hat{\mathcal{M}})$ and $\hat{\mathcal{G}} = (G, W_g^*)$ we see that the structure of a quantum system $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ is mirrored by the Hilbert space quantum system $(\hat{\mathcal{L}}, \mathcal{S}, \hat{X}, \hat{\mathcal{G}})$.

(C) Strengths and Weaknesses

One of the strengths of the quantum-logic approach is that its axioms are simple and physically justified. This approach has contributed to the understanding of many quantum-mechanical concepts. However, one of its weaknesses is that it is too general for use in concrete problems. The quantum systems studied above are an improvement, but the representation of a quantum system $(\mathcal{L}, \mathcal{S}, X, \mathcal{G})$ as a Hilbert space quantum system $(\hat{\mathcal{L}}, \mathcal{S}, \hat{X}, \hat{\mathcal{G}})$ given above is not completely satisfactory for the following two reasons. First, except for the propositions in the range of $X$, there is no isomorphism between $\mathcal{P}$ and $\hat{\mathcal{P}}$, so the propositions in general are not represented by $\hat{\mathcal{P}}$. Second, there is no provision for distinguishing between pure and mixed states, since all the states in $\mathcal{M}$ are mapped onto pure states in $\hat{\mathcal{M}}$.

5. THE CONVEXITY APPROACH

In this approach the states are taken as the undefined primitive axiomatic elements. The important property of states, as far as this approach is concerned, is that they are closed under the formation of convex mixtures. Now it is easy to define a convex combination of elements in a linear space. However, the linear space is artificial and devoid of physical meaning for states. One cannot add states or multiply them by scalars to get other states.
Only the operation of forming convex combinations of states has meaning. For this reason an abstract definition of convex mixtures is given that is independent of the concept of linearity. This approach to convexity originated with Stone (1949) and von Neumann and Morgenstern (1944) and was later developed by Mielnik (1968, 1969), Ludwig (1968), Davies and Lewis (1970) and others.

(A) Convex Structures

We begin with a framework due to Noll and Cain (1974). Let $S$ be the set of states for a physical system. We would like to define a notion of mixing a finite number of elements of $S$ according to a given recipe. These recipes are described by listing the finite number of elements that are to be mixed together with the proportion of each element. Thus a recipe can be thought of as a function $f : S \to [0, 1]$ such that

(1). $f(s) = 0$ except for finitely many $s$'s;
(2). $\sum_s f(s) = 1$.

If we define the support supp $f$ of a function $f$ to be the set on which $f$ does not vanish, then condition (1) is the same as saying that supp $f$ is finite. We define the simplex $\Delta S$ of $S$ to be the set of recipes on $S$. There is a natural map $\delta : S \to \Delta S$ whose values $\delta_s$ are given by $\delta_s(t) = 1$ if $t = s$ and $\delta_s(t) = 0$ if $t \ne s$. Thus $\delta_s$ is the characteristic function of the singleton set $\{s\}$. Notice that every recipe has the form $f = \sum_{i=1}^n \lambda_i \delta_{s_i}$, where $\lambda_i > 0$, $\sum_{i=1}^n \lambda_i = 1$. Furthermore, there is a natural map $T : \Delta\Delta S \to \Delta S$ given by $T(F) = \sum_{f \in \Delta S} F(f) f$. Finally, with every map $M : \Delta S \to S$ we can associate, in a natural way, a corresponding map $\bar{M} : \Delta\Delta S \to \Delta S$ defined as

$$\bar{M}(F)(s) = \sum \{F(f) : f \in \Delta S,\ M(f) = s\}.$$

We say that $M : \Delta S \to S$ is a mixing operation for $S$ if $M$ is surjective and satisfies $M \circ T = M \circ \bar{M}$. A less concise but more illuminating way of writing this last equation is

$$M\left(\sum \lambda_i f_i\right) = M\left(\sum \lambda_i \delta_{M(f_i)}\right) \qquad (4)$$

when $f_i \in \Delta S$, $\lambda_i > 0$, $\sum \lambda_i = 1$. We can interpret $M$ as follows. If $f \in \Delta S$ is a recipe then $M(f)$ is the state resulting from mixing the $s \in S$ in the relative amounts $f(s)$ prescribed by the recipe.
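The maps $\delta$, $T$ and $\bar{M}$ above can be made concrete in a few lines of Python. This is only a sketch under stated assumptions: the state set is taken to be the rational numbers with the ordinary weighted average as mixing operation, recipes are stored as sorted tuples of (state, weight) pairs so that they can themselves be mixed, and the names `recipe`, `delta`, `T`, `lift` and `M` are illustrative rather than taken from the text.

```python
from fractions import Fraction as Fr

def recipe(pairs):
    """Build a recipe (finitely many positive weights summing to 1)
    as a canonical, hashable tuple of (state, weight) pairs."""
    acc = {}
    for s, w in pairs:
        acc[s] = acc.get(s, Fr(0)) + Fr(w)
    items = tuple(sorted((s, w) for s, w in acc.items() if w != 0))
    assert all(w > 0 for _, w in items) and sum(w for _, w in items) == 1
    return items

def delta(s):
    """The point recipe delta_s."""
    return recipe([(s, 1)])

def T(F):
    """T(F) = sum over f of F(f) f: flatten a recipe of recipes."""
    return recipe([(s, wF * wf) for f, wF in F for s, wf in f])

def lift(M):
    """M-bar: M-bar(F)(s) is the sum of F(f) over those f with M(f) = s."""
    return lambda F: recipe([(M(f), wF) for f, wF in F])

def M(f):
    """Assumed mixing operation on the rationals: the weighted average."""
    return sum(w * s for s, w in f)

f1 = recipe([(0, Fr(1, 2)), (4, Fr(1, 2))])   # mixes to 2
f2 = recipe([(2, Fr(1, 4)), (6, Fr(3, 4))])   # mixes to 5
F = recipe([(f1, Fr(1, 3)), (f2, Fr(2, 3))])  # a recipe of recipes

print(M(T(F)), M(lift(M)(F)))   # 4 4
```

Both routes give the same state here: flattening the recipe of recipes and then mixing, or mixing each inner recipe first and then mixing the results, which is the content of $M \circ T = M \circ \bar{M}$.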
Intuitively $M(f)$ means mix the states according to the recipe $f$. Condition (4) means that if we mix a set of mixtures, we obtain the same result as when we apply each mixture individually and then mix.

It is instructive to see that the usual notion of a mixture in a linear space satisfies (4). Suppose then that $S$ is a real vector space. Then, in this case, a mixing operation $M : \Delta S \to S$ should satisfy $M(\sum \lambda_i \delta_{s_i}) = \sum \lambda_i s_i$ ($\lambda_i > 0$, $\sum \lambda_i = 1$), or more generally $M(\sum \lambda_i f_i) = \sum \lambda_i M(f_i)$ and $M(\delta_{s_i}) = s_i$. But then

$$M\left(\sum \lambda_i f_i\right) = \sum \lambda_i M(f_i) = \sum \lambda_i M(\delta_{M(f_i)}) = M\left(\sum \lambda_i \delta_{M(f_i)}\right).$$

Let us now return to the general case. It follows from condition (4) that if $M$ is a mixing operation, then $M(f) = M(\delta_{M(f)})$ for every $f \in \Delta S$. Since $M$ is surjective, given $s \in S$ there is $f \in \Delta S$ such that $M(f) = s$. Hence $M(\delta_s) = s$ for every $s \in S$. This is interpreted as meaning that a mixture for a recipe containing one ingredient is identical to that ingredient.

Let $M$ be a mixing operation on $S$. We define a map from $[0, 1] \times S \times S$ into $S$, $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$, as follows: $\langle \lambda, s, t \rangle = M[\lambda \delta_s + (1 - \lambda)\delta_t]$. We can interpret $\langle \lambda, s, t \rangle$ as a mixture of the states $s$ and $t$ in the ratio $\lambda : (1 - \lambda)$. The following lemma lists the important properties of $\langle \lambda, s, t \rangle$. This lemma is proved by a straightforward application of the definition.

Lemma 11. If $M$ is a mixing operation, the map $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ satisfies the following conditions:

(M1). $\langle 1, s, t \rangle = s$;
(M2). $\langle \lambda, s, s \rangle = s$;
(M3). $\langle \lambda, s, t \rangle = \langle 1 - \lambda, t, s \rangle$;
(M4). $\langle \lambda, s, \langle \mu, t, v \rangle \rangle = \langle \lambda + (1 - \lambda)\mu, \langle \lambda[\lambda + (1 - \lambda)\mu]^{-1}, s, t \rangle, v \rangle$ whenever $\lambda\mu \ne 0$.

We call a map $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ satisfying (M1)-(M4) a binary mixing operation. The next theorem shows that any mixing operation can be obtained by successive applications of a binary mixing operation.

Theorem 12. (Noll and Cain, 1974) If $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ is a binary mixing operation, then there exists a unique mixing operation $M$ such that $\langle \lambda, s, t \rangle = M[\lambda \delta_s + (1 - \lambda)\delta_t]$.
Furthermore $M$ can be obtained by a successive application of $\langle \lambda, s, t \rangle$.

Because of Theorem 12 we can work exclusively with the binary mixing operation $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ and we shall do so in the following. A mixing operation is distinguishing if the corresponding binary mixing operation satisfies

(M5). If $\langle \lambda, s, t_1 \rangle = \langle \lambda, s, t_2 \rangle$ for some $s \in S$ and $\lambda \ne 1$, then $t_1 = t_2$.

Distinguishability is a reasonable physical condition which we shall later see is equivalent to having enough observables to distinguish between states. We call a set with a distinguishing mixing operation a convex structure. The axiom of this approach is the following.

Axiom. The set of states for a physical system form a convex structure.

The standard example of a convex structure is a real vector space $V$ in which $\langle \lambda, s, t \rangle = \lambda s + (1 - \lambda)t$. The reader can easily check that this gives a convex structure. When we consider convex sets in vector spaces we always assume they are equipped with the above convex structure.

It is convenient to also consider a framework which is much more general than a convex structure. A convex prestructure is a set $S$ together with a function $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$ from $[0, 1] \times S \times S$ into $S$. This concept is so general that any non-empty set is a convex prestructure. This is because no conditions are placed upon the map $(\lambda, s, t) \mapsto \langle \lambda, s, t \rangle$. If $S_1$ and $S_2$ are convex prestructures, a map $A : S_1 \to S_2$ is affine if $A\langle \lambda, s, t \rangle_1 = \langle \lambda, As, At \rangle_2$ for all $\lambda \in [0, 1]$, $s, t \in S_1$. We say that $S_1$ and $S_2$ are isomorphic if there is an affine bijection from $S_1$ to $S_2$. An affine functional $f$ is an affine map from a convex prestructure $S$ to the real line $R$; that is, $f(\langle \lambda, s, t \rangle) = \lambda f(s) + (1 - \lambda)f(t)$ for all $\lambda \in [0, 1]$, $s, t$ in $S$. We denote the set of affine functionals on $S$ by $S^*$ and say that $S^*$ is total if for any $s, t \in S$ with $s \ne t$ there exists $f \in S^*$ such that $f(s) \ne f(t)$.

Suppose $S$ is a convex prestructure that corresponds to the set of states for some physical system.
Since a bounded observable has an expectation in every state, the bounded observables can be thought of as functionals on $S$. It is also physically reasonable that these functionals are affine. Furthermore, since a state is determined by the expectation values it gives to bounded observables, it is reasonable to assume that $S^*$ is total.

Theorem 13. A convex prestructure $S$ is a convex structure if and only if $S^*$ is total.

Proof. For sufficiency it is a simple matter to show that conditions (M1)-(M5) hold if $S^*$ is total. For example, for (M1), since $f(\langle 1, s, t \rangle) = f(s)$ for all $f \in S^*$ we have $\langle 1, s, t \rangle = s$. Necessity will follow from the second representation theorem proved later.

A convex substructure of a convex structure $S$ is a subset $S_1 \subseteq S$ which satisfies $\langle \lambda, s, t \rangle \in S_1$ whenever $s, t \in S_1$, $\lambda \in [0, 1]$. A subset $F \subseteq S$ is a face if $F$ is a convex substructure and if $\langle \lambda, s, t \rangle \in F$ for some $\lambda \in (0, 1)$ implies $s, t \in F$. An element $s \in S$ is an extreme point if $\{s\}$ is a face. Thus $s$ is an extreme point if and only if $s$ is not a mixture of other elements.

(B) Representation Theorems

In this section we give two vector space representation theorems for convex structures. But first we need some definitions concerning convex sets. Let $S_0$ be a convex subset of a real vector space $V$ (i.e. $x, y \in S_0$ implies $\lambda x + (1 - \lambda)y \in S_0$ for every $\lambda \in [0, 1]$). The hyperplane, cone and subspace, respectively, generated by $S_0$ are defined as follows:

$$H(S_0) = \left\{\sum \lambda_i x_i : \sum \lambda_i = 1,\ \lambda_i \in R,\ x_i \in S_0\right\}$$
$$K(S_0) = \left\{\sum \lambda_i x_i : \lambda_i \ge 0,\ x_i \in S_0\right\}$$
$$V(S_0) = \left\{\sum \lambda_i x_i : \lambda_i \in R,\ x_i \in S_0\right\}$$

Two vector spaces are isomorphic if there exists a linear bijection from one to the other. We first consider the question of uniqueness of representations.

Theorem 14. (Uniqueness) Let $S$ be a convex prestructure and let $T_1$ and $T_2$ be affine bijections from $S$ onto convex subsets $T_1(S)$, $T_2(S)$ of two real vector spaces $V_1$ and $V_2$, respectively.
If $0 \notin H(T_1(S))$ and $0 \notin H(T_2(S))$ then $V(T_1(S))$ and $V(T_2(S))$ are isomorphic.

Proof. The function $g = T_2 \circ T_1^{-1}$ is an affine bijection from $T_1(S)$ to $T_2(S)$. We first extend $g$ to $K(T_1(S))$. First, if $y \in K(T_1(S))$ then $y = \sum \lambda_i x_i$, $\lambda_i > 0$, $x_i \in T_1(S)$. Hence $y = \sum_j \lambda_j \sum_i (\lambda_i / \sum_j \lambda_j) x_i = \lambda x$ where $\lambda > 0$ and $x \in T_1(S)$. We now show that the representation $y = \lambda x$ is unique. Indeed, suppose $y = \lambda x = \mu z$ where $\lambda, \mu > 0$ and $x, z \in T_1(S)$. Then if $\lambda \ne \mu$ we have

$$0 = \lambda x - \mu z = (\lambda - \mu)[\lambda(\lambda - \mu)^{-1}x - \mu(\lambda - \mu)^{-1}z].$$

Since $0 \notin H(T_1(S))$ the second factor on the right-hand side is not 0. Hence $\lambda = \mu$, which is a contradiction. Thus $\lambda = \mu$ and hence $x = z$. Define $g(y) = \lambda g(x)$. It is easy to see that the extended $g : K(T_1(S)) \to K(T_2(S))$ is a bijection. The following shows that $g$ is additive on $K(T_1(S))$:

$$g(\lambda x + \mu y) = g\{(\lambda + \mu)[\lambda(\lambda + \mu)^{-1}x + \mu(\lambda + \mu)^{-1}y]\} = \lambda g(x) + \mu g(y) = g(\lambda x) + g(\mu y).$$

Also, $g$ is homogeneous on $K(T_1(S))$ since $g(\lambda(\mu x)) = g(\lambda\mu x) = \lambda\mu g(x) = \lambda g(\mu x)$ for $\lambda, \mu > 0$, $x \in T_1(S)$.

We now extend $g$ to $V(T_1(S))$. Suppose $y \in V(T_1(S))$ and $y = \sum \lambda_i x_i$, $\lambda_i \in R$, $x_i \in T_1(S)$. Then the positive and negative coefficients can be grouped so that $y = \lambda x - \mu z$ where $\lambda, \mu \ge 0$ and $x, z \in T_1(S)$. Thus $y$ has the form $y = u - v$ where $u, v \in K(T_1(S))$. Define $g$ on $V(T_1(S))$ by $g(y) = g(u) - g(v)$. This extended $g$ is well-defined, since if $u - v = u_1 - v_1$, $u_1, v_1 \in K(T_1(S))$, then $u + v_1 = u_1 + v$, so by the additivity of $g$ on $K(T_1(S))$ we have $g(u) + g(v_1) = g(u_1) + g(v)$ and $g(u) - g(v) = g(u_1) - g(v_1)$. That $g : V(T_1(S)) \to V(T_2(S))$ is linear and bijective is now easily verified.

We say that a convex prestructure $S$ is represented as a convex set $S_0$ in a real vector space $V$ if there exists an affine bijection $T : S \to S_0$, with $0 \notin H(S_0)$ and $V(S_0) = V$. It follows from the uniqueness theorem that if $S$ is represented as convex sets $S_1$, $S_2$ in vector spaces $V_1$ and $V_2$, respectively, then $V_1$ and $V_2$ are isomorphic. Thus representative convex sets and their vector spaces are unique up to an isomorphism.
Furthermore, if $f \in S^*$, then $f \circ T^{-1}$ is an affine functional on $S_0$ and by a method similar to that used in the proof of Theorem 14, $f \circ T^{-1}$ has a unique extension into a linear functional plus a constant on $V(S_0)$.

Theorem 15. (First Representation Theorem) A convex prestructure $S$ can be represented as a convex set $S_0$ if and only if $S^*$ is total.

Proof. Let $F : S \to S_0$ be an affine bijection where $S_0$ is a convex subset of $V$. It is well known that the set of linear functionals $V^*$ on $V$ is total over $V$. Restricting the elements of $V^*$ to $S_0$ we get a total set of affine functionals for $S_0$. Now if $f \in V^*$, then $f \circ F \in S^*$ so $S^*$ is total.

Conversely, suppose $S^*$ is total. For $x \in S$ define $J(x) : S^* \to R$ by $J(x)f = f(x)$. Clearly $S^*$ is a vector space under pointwise operations and $J(x) \in S^{**}$ so that $J(S) \subseteq S^{**}$. Now $J(S)$ is a convex set since for $J(x), J(y) \in J(S)$ and $\lambda \in [0, 1]$ we have, for $f \in S^*$,

$$[\lambda J(x) + (1 - \lambda)J(y)]f = \lambda f(x) + (1 - \lambda)f(y) = f(\langle\lambda, x, y\rangle) = J(\langle\lambda, x, y\rangle)f$$

so $\lambda J(x) + (1 - \lambda)J(y) \in J(S)$. To show $J$ is injective suppose $x \ne y \in S$. Then since $S^*$ is total there is $f \in S^*$ such that $f(x) \ne f(y)$ so $J(x) \ne J(y)$. We now show that $0 \notin H(J(S))$. If $0 \in H(J(S))$ then there exist $\lambda_i \in R$, $x_i \in S$, $i = 1, \ldots, n$, with $\sum \lambda_i = 1$ such that $\sum \lambda_i J(x_i) = 0$. Then $\sum \lambda_i f(x_i) = 0$ for every $f \in S^*$. Letting $f \equiv 1$ we obtain the contradiction $\sum \lambda_i = 0$.

Now let $S$ be a convex structure. If $S^*$ is total the last theorem represented $S$ as a convex set in $S^{**}$. We now give a different representation which, although isomorphic to the last one by the uniqueness theorem, has a form that is useful in many applications. A cone is a set $K = \{X, Y, Z, \ldots\}$ on which there is a binary operation $(X, Y) \mapsto X + Y$ and a scalar multiplication $(\lambda, X) \mapsto \lambda X$, $\lambda \in R^+$ (i.e. $\lambda > 0$), $X, Y \in K$ satisfying:

(1). $X + Y = Y + X$;
(2). $X + (Y + Z) = (X + Y) + Z$;
(3). if $X + Y = X + Z$, then $Y = Z$;
(4). $\lambda(X + Y) = \lambda X + \lambda Y$;
(5). $(\lambda + \mu)X = \lambda X + \mu X$;
(6). $\lambda(\mu X) = (\lambda\mu)X$;
(7). $1 \cdot X = X$.

Theorem 16. (Second Representation Theorem) A convex structure $S$ can be represented as a convex set.

Proof.
We first show that $S$ can be extended to a cone. Let $P = \{(\lambda, x) : \lambda > 0,\ x \in S\}$. We define addition and scalar multiplication on $P$ by

$$(\lambda, x) + (\mu, y) = (\lambda + \mu,\ \langle\lambda(\lambda + \mu)^{-1}, x, y\rangle) \quad\text{and}\quad \lambda(\mu, x) = (\lambda\mu, x).$$

A straightforward verification using the properties of a convex structure shows that $P$ is a cone.

We next show that $P$ can be extended to a vector space. Let $V_0 = \{(X, Y) : X, Y \in P\}$. Define the relation $(X, Y) \sim (X', Y')$ if and only if $X + Y' = Y + X'$. This is easily seen to be an equivalence relation on $P \times P$. Denote the equivalence class containing $(X, Y)$ by $[(X, Y)]$ and let $V = \{[(X, Y)] : X, Y \in P\}$. Define addition on $V$ by $[(X, Y)] + [(X', Y')] = [(X + X', Y + Y')]$. To show this operation is well-defined, suppose $(X, Y) \sim (X_1, Y_1)$ and $(X', Y') \sim (X_1', Y_1')$. Then $X + Y_1 = Y + X_1$ and $X' + Y_1' = Y' + X_1'$. Hence $X + X' + Y_1 + Y_1' = Y + Y' + X_1 + X_1'$ and $(X + X', Y + Y') \sim (X_1 + X_1', Y_1 + Y_1')$. Under addition $V$ is an abelian group with zero $[(X, X)]$.

Define a scalar multiplication by real numbers as follows. If $\lambda > 0$, then $\lambda[(X, Y)] = [(\lambda X, \lambda Y)]$; if $\lambda = 0$, then $\lambda[(X, Y)] = [(X, X)]$; and if $\lambda < 0$, then $\lambda[(X, Y)] = [(-\lambda Y, -\lambda X)]$. As with addition, this operation is well-defined. It is straightforward to show that $V$ is a vector space.

Now define the maps $A : S \to P$ and $B : P \to V$ by $Ax = (1, x)$ and $BX = [(X + Y, Y)]$. The second map is well-defined since $(X + Y, Y) \sim (X + Z, Z)$ for every $Y, Z \in P$. Hence $B \circ A : S \to V$. Now $B$ is additive since

$$B(X + Z) = [(X + Z + Y, Y)] = [(X + Z + 2Y, 2Y)] = [(X + Y, Y)] + [(Z + Y, Y)] = BX + BZ.$$

Also $B$ is homogeneous since for $\lambda > 0$,

$$B(\lambda X) = [(\lambda X + Y, Y)] = [(\lambda X + \lambda Y, \lambda Y)] = \lambda[(X + Y, Y)] = \lambda BX.$$

Furthermore, $A$ is an affine map since

$$A\langle\lambda, x, y\rangle = (1, \langle\lambda, x, y\rangle) = (\lambda + (1 - \lambda), \langle\lambda, x, y\rangle) = (\lambda, x) + (1 - \lambda, y) = \lambda(1, x) + (1 - \lambda)(1, y) = \lambda Ax + (1 - \lambda)Ay.$$

It follows that $B \circ A : S \to V$ is affine. It is easily checked that $A$ and $B$ are injective so $B \circ A$ is injective. Also it is clear that $B \circ A(S)$ is convex and that $V[B \circ A(S)] = V$. Finally, suppose $0 \in H[B \circ A(S)]$.
Then there exist $X_i, Z_i \in P$ and $\lambda_i \in R$ with $\sum \lambda_i = 1$ such that $\sum \lambda_i[(X_i + Z_i, Z_i)] = 0$. Combining the positive coefficients and negative coefficients, there exist $\lambda, \mu > 0$, $\lambda \ne \mu$, $x, y \in S$, $Z \in P$ such that $[((\lambda, x) + Z, Z)] = [((\mu, y) + Z, Z)]$. This implies that $(\lambda, x) = (\mu, y)$. But then $\lambda = \mu$ which is a contradiction.

A distance can be defined in a very natural way in a convex structure $S$. If $x, y \in S$, the closeness of $x$ to $y$ can be measured by comparing mixtures $\langle\lambda, x_1, x\rangle$, $\langle\lambda, y_1, y\rangle$ of $S$. If $x$ and $y$ are very close we would expect to find a mixture containing mostly $x$ equal to a mixture containing mostly $y$; that is, $\langle\lambda, x_1, x\rangle = \langle\lambda, y_1, y\rangle$ in which $\lambda$ is very small. Conversely, if $\langle\lambda, x_1, x\rangle = \langle\lambda, y_1, y\rangle$ and $\lambda$ is small we expect that $x$ and $y$ are close. Thus the parameters $\lambda$ such that $\langle\lambda, x_1, x\rangle = \langle\lambda, y_1, y\rangle$ give a measure of the closeness of $x$ and $y$. We thus define a distance function $\sigma$ as follows:

$$\sigma(x, y) = \inf\{0 \le \lambda \le 1 : \langle\lambda, x_1, x\rangle = \langle\lambda, y_1, y\rangle,\ x_1, y_1 \in S\}$$

Notice that since $\langle\tfrac12, x, y\rangle = \langle\tfrac12, y, x\rangle$ we have $0 \le \sigma(x, y) \le \tfrac12$ for all $x, y \in S$. It is sometimes useful to make a change of scale and define the distance function $\rho(x, y) = \sigma(x, y)[1 - \sigma(x, y)]^{-1}$. Then $0 \le \rho(x, y) \le 1$ for every $x, y \in S$. Using a representation of $S$ it is straightforward to show that $\sigma$ and $\rho$ are metrics. One of the important properties of $\sigma$ and $\rho$ is that they are invariant under isomorphisms. That is, if $A : S_1 \to S_2$ is an isomorphism, then $\sigma_2(Ax, Ay) = \sigma_1(x, y)$ and $\rho_2(Ax, Ay) = \rho_1(x, y)$ for all $x, y \in S_1$.

There is also a relationship between $\rho$ and transition probabilities. One might expect, since $0 \le \rho(x, y) \le 1$, that $\rho$ has something to do with probabilities. Specifically, if $\rho(x, y)$ is small one might expect the transition probability from $x$ to $y$ to be large, while for large $\rho(x, y)$ a transition from $x$ to $y$ would be unlikely. This is indeed the case.
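As a concrete illustration of the distance functions defined above, consider the simplest classical case, in which the states are probability vectors on a finite set and the mixture of s and t with weight λ is λs + (1 - λ)t. This setting, the closed form used below and all function names are assumptions made for the sketch, not results quoted from the text. Writing d for half the l1 distance between x and y, the equality of the two mixtures forces λ ≥ d/(1 + d), and the states x1, y1 built in `witnesses` attain that value, so that in this model the rescaled distance equals d.

```python
def half_l1(x, y):
    """Half the l1 distance between two probability vectors."""
    return 0.5 * sum(abs(a - b) for a, b in zip(x, y))


def sigma(x, y):
    """Smallest mixing weight lam (classical model; closed form is an assumption)."""
    d = half_l1(x, y)
    return d / (1.0 + d)


def rho(x, y):
    """Rescaled distance sigma / (1 - sigma)."""
    s = sigma(x, y)
    return s / (1.0 - s)


def mix(lam, s, t):
    """The mixture lam*s + (1 - lam)*t."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(s, t)]


def witnesses(x, y):
    """States x1, y1 with mix(sigma, x1, x) == mix(sigma, y1, y)."""
    d = half_l1(x, y)
    if d == 0.0:
        raise ValueError("x and y coincide, so sigma is 0 and no witnesses are needed")
    x1 = [max(b - a, 0.0) / d for a, b in zip(x, y)]  # normalized part where y exceeds x
    y1 = [max(a - b, 0.0) / d for a, b in zip(x, y)]  # normalized part where x exceeds y
    return x1, y1


x = [0.5, 0.5, 0.0]
y = [0.0, 0.5, 0.5]
lam = sigma(x, y)
x1, y1 = witnesses(x, y)
left = mix(lam, x1, x)    # mostly x, with a little x1
right = mix(lam, y1, y)   # mostly y, with a little y1
```

For two states with disjoint supports the sketch returns the extreme values 1/2 and 1 for the two distances, in line with the bounds stated above.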
In fact, in the classical approach, if $\phi$ and $\psi$ are unit vectors corresponding to pure states $x$ and $y$, it can be shown that the transition probability $|(\phi, \psi)|^2 = 1 - \rho(x, y)^2$. For more details and other results the reader is referred to (Gudder, 1973a, b).

We next briefly show how this approach can be carried further. Let $S$ be the convex structure of states. Then by one of the representation theorems, $S$ can be represented by a convex set $S_0$ in a real vector space $V = V(S_0)$. The metric $\rho$ on $S$ can be transferred to $S_0$ giving a metric $\rho_0$ on $S_0$. It can be shown that there exists a unique norm $\|\cdot\|$ on $V$ such that $\|x - y\| = \rho_0(x, y)$ for every $x, y \in S_0$. In one interpretation the states are thought of as 'unit beams' and the cone $P = \{(\lambda, x) : \lambda > 0,\ x \in S\}$ of the second representation theorem is the space of beams. The functional $\tau : P \to R^+$ defined by $\tau[(\lambda, x)] = \lambda$ is interpreted as giving the beam intensity. It is easy to see that $\tau$ has a unique extension to a linear functional $\tau$ on $V$ and that $\tau(X) = \|X\|$ for every $X \in P$. The triple $(V, P, \tau)$ is called a base normed space and is the basic framework for the operational quantum mechanics of Davies and Lewis (1970).

(C) Strengths and Weaknesses

The strengths and weaknesses of this approach are similar to those of the quantum-logic approach. The axioms are simple and physically motivated. Although the approach has important theoretical uses, its practical utility has not been exploited. An important unsolved problem in this respect is to characterize convex structures that are isomorphic to the set of density operators on a Hilbert space.

REFERENCES

Davies, E. B. and Lewis, J. T. (1970) 'An operational approach to quantum probability', Commun. Math. Phys., 17, 239-260.
Dirac, P. A. M. (1930) The Principles of Quantum Mechanics, Clarendon Press, Oxford.
Emch, G. G. (1972) Algebraic Methods in Statistical Mechanics and Quantum Field Theory, Wiley-Interscience, New York.
Gelfand, I. and Naimark, M. A. (1943) 'On the imbedding of normed rings in the ring of operators in Hilbert space', Mat. Sb. N.S., 12 [54], 197-217.
Gleason, A. M. (1957) 'Measures on the closed subspaces of a Hilbert space', J. Math. Mech., 6, 885-894.
Gudder, S. (1973a) 'State automorphisms in axiomatic quantum mechanics', Intern. J. Theoret. Phys., 7, 205-211.
Gudder, S. (1973b) 'Convex structures and operational quantum mechanics', Commun. Math. Phys., 29, 249-264.
Gudder, S. (1973c) 'Quantum logics, physical space, position observables and symmetry', Rep. Math. Phys., 4, 193-202.
Gudder, S. and Boyce, S. (1970) 'A comparison of the Mackey and Segal models for quantum mechanics', Intern. J. Theoret. Phys., 3, 7-21.
Haag, R. and Kastler, D. (1964) 'An algebraic approach to quantum field theory', J. Math. Phys., 5, 848-861.
Jauch, J. (1968) Foundations of Quantum Mechanics, Addison Wesley, Reading, Mass.
Jauch, J. and Piron, C. (1967) 'Generalized localizability', Helv. Phys. Acta, 40, 559-570.
Jordan, P., von Neumann, J. and Wigner, E. (1934) 'On an algebraic generalization of the quantum mechanical formalism', Ann. Math., 35, 29-64.
Kolmogorov, A. N. (1956) Foundations of the Theory of Probability, Chelsea, New York.
Ludwig, G. (1968) 'Attempt of an axiomatic foundation of quantum mechanics and more general theories III', Commun. Math. Phys., 9, 1-12.
Mackey, G. W. (1963) The Mathematical Foundations of Quantum Mechanics, W. A. Benjamin Inc., New York.
Mackey, G. W. (1968) Induced Representations and Quantum Mechanics, W. A. Benjamin Inc., New York.
Maczynski, M. J. (1974) 'When the topology of an infinite-dimensional Banach space coincides with a Hilbert space topology', Studia Math., 49, 149-152.
Mazur, S. and Ulam, S. (1932) 'Sur les transformations isométriques d'espaces vectoriels normés', C.R. Acad. Sci. Paris, 194, 946-948.
Mielnik, B. (1968) 'Geometry of quantum states', Commun. Math. Phys., 9, 55-80.
Mielnik, B. (1969) 'Theory of filters', Commun. Math. Phys., 15, 1-46.
Noll, W. and Cain, R. N. (1974) 'Convexity, mixing, colors, and quantum mechanics', Preprint: Department of Mathematics, Carnegie-Mellon University, Pittsburgh, Pa.
Piron, C. (1964) 'Axiomatique quantique', Helv. Phys. Acta, 37, 439-468.
Plymen, R. J. (1968) 'C*-algebras and Mackey's axioms', Commun. Math. Phys., 8, 132-146.
Schatten, R. (1950) A Theory of Cross-Spaces, Ann. Math. Studies 26, Princeton University Press, Princeton, N.J.
Segal, I. E. (1947) 'Postulates for general quantum mechanics', Ann. Math., 48, 930-948.
Sherman, S. (1956) 'On Segal's postulates for general quantum mechanics', Ann. Math., 64.
Sikorski, R. (1949) 'On the inducing homomorphisms by mappings', Fund. Math., 36, 7-22.
Stone, M. H. (1930) 'Linear transformations in Hilbert space III. Operational methods and group theory', Proc. Nat. Acad. Sci. U.S.A., 16, 172-175.
Stone, M. H. (1949) 'Postulates for the barycentric calculus', Ann. Mat. Pura Appl., (4) 29, 25-30.
Varadarajan, V. S. (1962) 'Probability in physics and a theorem on simultaneous observability', Commun. Pure Appl. Math., 15, 189-217.
Varadarajan, V. S. (1968) Geometry of Quantum Theory I, Van Nostrand, Princeton, N.J.
Von Neumann, J. (1931) 'Die Eindeutigkeit der Schrödingerschen Operatoren', Math. Ann., 104, 570-578.
Von Neumann, J. (1932) Grundlagen der Quantenmechanik, Springer, Berlin; English translation by R. T. Beyer, Princeton University Press, Princeton, N.J., 1955.
Von Neumann, J. and Morgenstern, O. (1944) Theory of Games and Economic Behavior, Princeton University Press, Princeton, N.J.
Wigner, E. P. (1931) Gruppentheorie und ihre Anwendung, Vieweg, Braunschweig; English translation by J. J. Griffin, Academic Press, New York, 1959.
Zierler, N. (1961) 'Axioms for non-relativistic quantum mechanics', Pac. J. Math., 11, 1151-1169.

Intermediate Problems for Eigenvalues in Quantum Theory

WILLIAM STENGER
Ambassador College, Pasadena, U.S.A.
INTRODUCTION

In any general study of quantum theory one sooner or later becomes involved in a discussion of eigenvalue problems. In fact, eigenvalue problems provide not only a link with classical mechanics, but are actually, in a sense, typical of quantum mechanics even in classical problems. For instance, if we consider such classical problems as those of the vibrations of strings, membranes and plates and of the buckling of beams and plates, we immediately see quantum-like phenomena. The frequencies of vibration and buckling loads occur only at discrete numerical values. Modes of vibration and buckling 'jump' from one state to another. As a curiosity, one could say that these classical problems are more purely quantum-like than quantum problems in that the phenomenon of a continuous spectrum does not occur in classical cases.

In all these problems, whether we consider frequencies of vibration, buckling loads or energy levels, the common ground is, of course, the eigenvalue problem. Therefore, it is not at all surprising that methods, techniques and theoretical results dealing with eigenvalue problems of classical mechanics can be carried over and applied to problems of quantum mechanics. Weinstein's methods of intermediate problems and their variants, to which the present chapter is devoted, are particularly exemplary of this kind of development.

1. DEFINITIONS AND NOTATIONS

Let $\mathfrak{H}$ be a real or complex Hilbert space having the scalar product $(u, v)$ and let $H$ be a self-adjoint linear operator defined on a subspace $\mathfrak{D}$ dense in $\mathfrak{H}$. In problems discussed here, $H$ is bounded below and the lower part of its spectrum consists of a finite or infinite number of isolated eigenvalues $\lambda_1 \le \lambda_2 \le \ldots$ each having finite multiplicity. Let $\lambda_\infty$ denote the lowest point (if any) in the essential spectrum of $H$.
The point $\lambda_\infty$ could be a non-isolated eigenvalue of finite or infinite multiplicity, an isolated eigenvalue of infinite multiplicity or a spectral point which is not an eigenvalue. There may be point eigenvalues, even isolated point eigenvalues, which are above $\lambda_\infty$. However, when we enumerate the eigenvalues $\lambda_1, \lambda_2, \ldots$, we mean the isolated eigenvalues that are below $\lambda_\infty$. We shall denote by $u_1, u_2, \ldots$ a corresponding orthonormal sequence of eigenfunctions.

The selection of operators having these properties is motivated by the fact that many problems in classical and quantum mechanics involve operators of this type. Since the Schrödinger operators for hydrogen, helium, etc., have such spectra, we call this type of operator 'type-$\mathcal{S}$'.

2. INTERMEDIATE PROBLEMS

It is possible to solve exactly for the eigenvalues of only a few operators of type-$\mathcal{S}$. In most cases one must devise methods of estimating the eigenvalues. Since an approximation is useless without also specifying its accuracy, the best approximations for eigenvalues have come by means of complementary methods, that is, methods which approximate the eigenvalues from above and below.

The Rayleigh-Ritz Method has been used widely to obtain approximations from above (upper bounds) to the eigenvalues of operators of type-$\mathcal{S}$. This method is fairly straightforward to apply and with the advent of high-speed computers has given results of remarkable accuracy. For a detailed discussion of the Rayleigh-Ritz Method, see the books of Gould (1966) and Weinstein and Stenger (1972).

The problem of finding lower bounds to eigenvalues is intrinsically much more difficult. The first major breakthrough in this area was made by Alexander Weinstein (1935, 1937) who introduced intermediate problems to determine lower bounds to the buckling load and frequencies of vibration of a clamped square plate. In solving these problems Weinstein used classical techniques involving natural boundary conditions and Lagrange multipliers.
Soon the problems were reformulated in the language of Hilbert space.

Without going into detail, the basic idea of intermediate problems is as follows. Given an eigenvalue problem $Hu = \lambda u$ we first find another eigenvalue problem, called the base problem, $Au = \lambda u$ whose eigenvalues are all lower than those of $H$. We then build a sequence of intermediate problems depending on a finite number of functions which link the base problem to the given problem and whose eigenvalues are intermediate between those of $H$ and those of $A$. Finally, we must solve for the eigenvalues of the intermediate problems, thereby obtaining lower bounds to the eigenvalues of $H$.

In the problems solved by Weinstein the given problem turned out to be of the form

$$Hu = Au - PAu = \lambda u, \qquad Pu = 0$$

where $P$ is the orthogonal projection operator onto a subspace $\mathfrak{P}$ of $\mathfrak{H}$. The base problem here is $Au = \lambda u$. If we select a finite number of functions $p_1, p_2, \ldots, p_n$ from $\mathfrak{P}$ and let $P_n$ be the orthogonal projection operator onto the subspace spanned by the functions $p_i$, $i = 1, 2, \ldots, n$, we can formulate the $n$th intermediate problem

$$Au - P_n Au = \lambda u, \qquad P_n u = 0$$

which has eigenvalues intermediate between those of $A$ and of $H$. This is called an intermediate problem of the first-type. The eigenvalues of the intermediate problems are obtained from the Weinstein determinant

$$W(\lambda) = \det\{(R_\lambda p_i, p_k)\}, \qquad i, k = 1, 2, \ldots, n$$

where $R_\lambda$ is the resolvent operator of $A$, i.e. $R_\lambda = (A - \lambda I)^{-1}$.

If no special assumptions are made on the choice of functions $p_i$, it is possible that an eigenvalue of the base problem is also an eigenvalue of the intermediate problem. We call such eigenvalues persistent. Since the determinant $W(\lambda)$ may be singular at a persistent eigenvalue, in numerical applications a so-called 'big' Weinstein determinant is used which avoids possible singularities. An analogous situation occurs in problems of the second-type discussed later; see Weinstein and Stenger (1972) for details.
Intermediate problems of the first type have been applied numerically to problems of classical mechanics and have also had theoretical applications, some of which are related to quantum theory, see for instance Stenger (1968). A complete discussion of the numerical and theoretical applications of the first type of intermediate problems is given in the book of Weinstein and Stenger (1972).

A second-type of intermediate problems was introduced by Aronszajn (1951). The basic pattern here is the same as in intermediate problems of the first-type, although the form of the problems is somewhat different. In particular, the given problem admits the decomposition

$$Hu = Au + Bu = \lambda u$$

where $A$ is of type-$\mathcal{S}$ and $B$ is positive. The base problem is $Au = \lambda u$ and the $n$th intermediate problem is given by

$$Au + \sum_{i=1}^{n} \sum_{j=1}^{n} (u, Bp_i)\,\beta_{ij}\,Bp_j = \lambda u$$

where $\{\beta_{ij}\}$ is the inverse matrix of $\{(Bp_i, p_j)\}$. For a suitable choice of $\alpha_i$ and $q_i$ the intermediate problem can be written in the more general, yet simpler, form

$$Au + \sum_{j=1}^{n} \alpha_j (u, q_j)\,q_j = \lambda u$$

The eigenvalues of the intermediate problems are obtained from the Weinstein determinant

$$V(\lambda) = \det\{\delta_{ik} + \alpha_i (R_\lambda q_i, q_k)\}, \qquad i, k = 1, 2, \ldots, n$$

This determinant has also been called the modified Weinstein determinant and the Weinstein-Aronszajn determinant.

Up to now all numerical applications of intermediate problems to quantum theory have involved problems of the second-type or variants of problems of the second-type, as is illustrated in subsequent sections.

It should be mentioned here that while intermediate problems are not a part of perturbation theory, the solution of problems of the second-type has led to a number of contributions to perturbation theory, see for instance Kuroda (1961), Kato (1966), Stenger (1969), Weinstein and Stenger (1972) and Weinstein (1974).
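A small finite-dimensional sketch may help to see how the determinant of the second-type problem is used. It assumes, purely for illustration, a base operator that is diagonal with simple eigenvalues a_1 < a_2 < ... and a single added term α(u, q)q with α > 0, so that the determinant reduces to the scalar function 1 + α Σ q_i²/(a_i - λ). That function increases from minus infinity to plus infinity between consecutive base eigenvalues, so each root can be bracketed and found by bisection. The function names and the numerical data are hypothetical.

```python
def wa_determinant(lam, poles, weights, alpha):
    """V(lam) = 1 + alpha * (R_lam q, q) for a diagonal base operator."""
    return 1.0 + alpha * sum(w / (a - lam) for a, w in zip(poles, weights))


def wa_roots(poles, weights, alpha, iters=200):
    """Roots of V, one in each interval to the right of a pole.

    Assumes strictly increasing poles, positive weights and alpha > 0.
    """
    # The last root cannot exceed the top pole by more than alpha * |q|^2.
    upper_ends = list(poles[1:]) + [poles[-1] + alpha * sum(weights) + 1.0]
    roots = []
    for lo, hi in zip(poles, upper_ends):
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if mid <= lo or mid >= hi:   # interval has shrunk to machine precision
                break
            if wa_determinant(mid, poles, weights, alpha) < 0.0:
                lo = mid                 # V is increasing, so the root lies to the right
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots


base = [1.0, 2.0, 4.0]          # eigenvalues of the (diagonal) base operator
q = [1.0, 1.0, 1.0]             # components (q, u_i) of the added vector
alpha = 0.5
weights = [c * c for c in q]
roots = wa_roots(base, weights, alpha)
```

Each root lies strictly above the corresponding base eigenvalue, and the roots sum to the trace of the perturbed matrix, which gives a quick consistency check on the computation.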
Several attempts have also been made to 'reduce' intermediate problems of the first-type to those of the second-type, e.g. Kuroda (1961), Fichera (1965) and Kato (1966). While such a reduction can be made in certain cases under severe limitations, the result in every such case actually leads to a more complicated problem than the original, see Stenger (1970a, b) and Weinstein and Stenger (1972).

3. BAZLEY'S APPLICATION TO THE HELIUM ATOM

The first application of intermediate problems to quantum theory was given by Bazley (1959, 1960, 1961) who was then joined by Fox. Their collaboration, as well as individual research, produced many significant numerical and theoretical results, e.g. Bazley and Fox (1961a, b; 1962a, b, c; 1963a, b; 1964; 1966a, b, c), Fox (1972). A more complete bibliography, a survey of their work and tables of numerical values are given in Weinstein and Stenger (1972).

Bazley first considered the problem of estimating the eigenvalues of the Hamiltonian operator for the helium atom. We now give an overview of the application to helium, omitting details which may be found in the original papers and book cited above.

If we neglect nuclear motion, relativistic effects and the influence of spin and denote by $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ the coordinates of the two electrons, the Schrödinger equation for helium is

$$Hu = -\tfrac12 \Delta_1 u - \tfrac12 \Delta_2 u - (2/r_1)u - (2/r_2)u + (1/r_{12})u = \lambda u$$

where $\Delta_i$ is the Laplacian in the coordinates $(x_i, y_i, z_i)$,

$$r_i = (x_i^2 + y_i^2 + z_i^2)^{1/2}, \qquad i = 1, 2$$
$$r_{12} = [(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2]^{1/2}$$

While the domain of definition of $H$ from the point of view of the physicist was historically only vaguely defined, Kato (1951a, b) considered the operator $H$ on the Hilbert space of square-integrable functions (i.e., $\mathcal{L}_2$ space) over six-dimensional coordinate space. He showed that $H$ admits there a unique self-adjoint extension, that is, that $H$ is essentially self-adjoint.
In other words he proved that the closure of $H$ is self-adjoint, see also Kato (1966). Of course, the closure of $H$ is no longer a differential operator in the usual sense, but it reduces to the differential operator for sufficiently regular functions. If we wanted to be notationally strict, we should use different symbols for the formal differential operator and the self-adjoint extension in Hilbert space. However, for the sake of this exposition we avoid encumbering the notation and use the same symbol to denote the formal operator and its corresponding Hilbert space operator.

It is by no means a foregone conclusion that $H$ is of type-$\mathcal{S}$. It is therefore significant that Kato (1951a, b) showed that the spectrum of $H$ begins with isolated eigenvalues, each of finite multiplicity.

Following the general pattern of intermediate problems, we first find a suitable base problem. In the present case the base problem is

$$Au = -\tfrac12 \Delta_1 u - \tfrac12 \Delta_2 u - (2/r_1)u - (2/r_2)u = \lambda u$$

This operator $A$ is the Hamiltonian of a system composed of two independent hydrogen-like atoms and admits a unique self-adjoint extension having the same domain as $H$. The eigenvalues and eigenfunctions of $A$ are well known, see Kemble (1958). The eigenvalues are given by

$$-2[(1/n_1^2) + (1/n_2^2)], \qquad n_1, n_2 = 1, 2, \ldots$$

with multiplicities $n_1^2 n_2^2$, and the corresponding eigenfunctions are products of hydrogenic wave functions. Since the continuous spectrum of $A$ consists of the interval $[-2, \infty)$, the lower part of the spectrum begins with isolated eigenvalues, that is to say, $A$ is also of type-$\mathcal{S}$.

If we now decompose the given operator $H$ as $H = A + B$, where $B$ is the non-negative operator given by

$$Bu = (1/r_{12})u$$

we are in a position to form intermediate problems of the second-type.

At this point one could attempt to solve intermediate problems with arbitrary functions $p_i$. Such an approach, however, would be fruitless since the resulting intermediate problems would not lend themselves to numerical solutions.
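The structure of the base spectrum quoted above is easy to tabulate. The following sketch (function and variable names are hypothetical) lists the values -2(1/n1² + 1/n2²) for small quantum numbers using exact rational arithmetic, adds up the multiplicities n1²n2² of all pairs (n1, n2) that share a value, and picks out the values lying below the start of the continuous spectrum at -2. It only illustrates the formula as stated and takes no account of spin or symmetry classes.

```python
from fractions import Fraction


def base_levels(nmax):
    """Map each value -2(1/n1^2 + 1/n2^2), 1 <= n1, n2 <= nmax, to its total multiplicity."""
    levels = {}
    for n1 in range(1, nmax + 1):
        for n2 in range(1, nmax + 1):
            energy = -2 * (Fraction(1, n1 * n1) + Fraction(1, n2 * n2))
            # Each ordered pair (n1, n2) contributes n1^2 * n2^2 product eigenfunctions.
            levels[energy] = levels.get(energy, 0) + n1 * n1 * n2 * n2
    return levels


levels = base_levels(6)
threshold = Fraction(-2)   # lowest point of the continuous spectrum of A
isolated = sorted(e for e in levels if e < threshold)
```

Only pairs with n1 = 1 or n2 = 1 produce values below -2, so the isolated eigenvalues found this way are exactly those of the form -2(1 + 1/k²).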
In order to overcome this difficulty, Bazley introduced a special choice of functions which led to an algebraic problem, readily solvable by using computers. The special choice of Bazley in certain respects parallels the distinguished choice used earlier by Weinstein in problems of the first-type.

In order to form the special choice, we let $u_i^{(0)}$ denote the (known) eigenfunctions of the base problem $Au = \lambda u$. We choose vectors $p_i$ such that

$$Bp_i = u_i^{(0)}, \qquad i = 1, 2, \ldots, n$$

In this way, the $n$th intermediate problem becomes

$$Au + \sum_{i=1}^{n} \sum_{j=1}^{n} \beta_{ij}\,(u, u_i^{(0)})\,u_j^{(0)} = \lambda u$$

where $\{\beta_{ij}\}$ is the matrix inverse to $\{(Bp_i, p_j)\}$.

In the case of helium the spectrum of $A$ begins with isolated eigenvalues

$$\lambda_k^{(0)} = -2[1 + (1/k^2)], \qquad k = 1, 2, \ldots$$

having eigenfunctions

$$u_1^{(0)} = (1/4\pi)\,R_{10}(r_1)R_{10}(r_2)$$
$$u_k^{(0)} = (1/\sqrt{2}\,4\pi)\,[R_{10}(r_1)R_{k0}(r_2) + R_{10}(r_2)R_{k0}(r_1)], \qquad k = 2, 3, \ldots$$

where the elements $R_{k0}$ are the normalized hydrogen radial wave functions. By observing that $B$ is easily invertible and yields the special choice

$$p_i = r_{12}\,u_i^{(0)} \qquad (i = 1, 2, 3),$$

Bazley solved the third intermediate problem and obtained the lower bounds $-3.063 \le E(1\,{}^1S)$ and $-2.1655 \le E(2\,{}^1S)$ for the S-states of parahelium. Bazley obtained an improved lower bound for $E(1\,{}^1S)$ by using the lower bound for $E(2\,{}^1S)$ in Temple's formula. This lower bound combined with the Rayleigh-Ritz upper bound computed by Kinoshita (1957) gives the quite accurate estimate

$$-2.9037474 \le E(1\,{}^1S) \le -2.9037237.$$

While we have concentrated our attention here on the first and second eigenvalues, it should be noted that intermediate problems may be used to obtain lower bounds to an arbitrary number of eigenvalues at the lower part of the spectrum.

4. TRUNCATION OF THE BASE OPERATOR

In many numerical applications, if we take the most obvious or natural base problem and attempt to solve intermediate problems relative to that base problem, we are confronted with transcendental equations which cannot be readily solved by numerical means. However, these computational difficulties may be circumvented by the following truncation of the base operator.

The idea of truncating the base operator was introduced by Weinberger (1959) in problems of the first-type and was later developed by Bazley and Fox (1961b) in problems of the second-type where it was successfully applied to problems of quantum theory.

We begin by considering the spectral representation of the original base operator $A$, namely

$$Au = \sum_i \lambda_i^{(0)} (u, u_i^{(0)})\,u_i^{(0)} + \int_{\lambda_\infty - 0}^{\infty} \lambda\, dE_\lambda u$$

where the sigma may denote a finite sum or an infinite series. For a fixed positive integer $N$, we define the truncation operator of order $N$ by

$$T_N u = \sum_{i=1}^{N} \lambda_i^{(0)} (u, u_i^{(0)})\,u_i^{(0)} + \lambda_{N+1}^{(0)} \int_{\lambda_{N+1}^{(0)} - 0}^{\infty} dE_\lambda u$$

We assume without loss of generality that $\lambda_N^{(0)} < \lambda_{N+1}^{(0)}$. The truncation operator is generally simpler than the original base operator since it consists of a negative-semidefinite operator of finite rank plus a multiple of the identity. Such an operator has only a finite number of distinct eigenvalues and no continuous spectrum.

The advantage gained by using the truncation operator is that the resolvent of $T_N$ has the form

$$R_\lambda(T_N)p = \sum_{i=1}^{N} \frac{(p, u_i^{(0)})\,u_i^{(0)}}{\lambda_i^{(0)} - \lambda} + \frac{1}{\lambda_{N+1}^{(0)} - \lambda}\Big[p - \sum_{i=1}^{N} (p, u_i^{(0)})\,u_i^{(0)}\Big]$$

Therefore, in the intermediate problems the resulting Weinstein determinant is a rational function instead of a (generally) transcendental function, thus reducing the difficulty of numerically determining the eigenvalues of the intermediate problems.

Bazley and Fox (1961a) applied the method of truncation to the helium atom. Even with a truncation of order two and only the second intermediate problem they obtained an improvement of the lower bounds Bazley (1961) obtained by the special choice.
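The effect of truncation can be imitated in a finite-dimensional model. In the sketch below, which is an illustration under stated assumptions rather than the procedure of Bazley and Fox, the base operator is diagonal with eigenvalues 1, ..., 6, the added term is a single α(u, q)q with α > 0, and truncation of order N replaces every eigenvalue beyond the Nth by the (N+1)th. The determinant for the truncated operator then has only N + 1 poles, and because the truncated operator lies below the base operator, its first N roots are lower bounds to the corresponding roots for the untruncated operator. All names and data are hypothetical.

```python
def truncate(eigs, n_keep):
    """Spectrum of T_N: keep the n_keep lowest eigenvalues, replace the rest by eigs[n_keep]."""
    return eigs[:n_keep] + [eigs[n_keep]] * (len(eigs) - n_keep)


def poles_and_weights(eigs, q):
    """Merge repeated eigenvalues into single poles, adding up the weights (q, u_i)^2."""
    poles, weights = [], []
    for a, c in zip(eigs, q):
        if poles and poles[-1] == a:
            weights[-1] += c * c
        else:
            poles.append(a)
            weights.append(c * c)
    return poles, weights


def wa_roots(poles, weights, alpha, iters=200):
    """Roots of 1 + alpha * sum(w_i / (a_i - lam)), found by bisection between poles."""
    def v(lam):
        return 1.0 + alpha * sum(w / (a - lam) for a, w in zip(poles, weights))

    upper_ends = poles[1:] + [poles[-1] + alpha * sum(weights) + 1.0]
    roots = []
    for lo, hi in zip(poles, upper_ends):
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if mid <= lo or mid >= hi:
                break
            if v(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots


base = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
q = [1.0] * 6
alpha = 0.5
N = 2

full_poles, full_weights = poles_and_weights(base, q)
trunc_poles, trunc_weights = poles_and_weights(truncate(base, N), q)
full_roots = wa_roots(full_poles, full_weights, alpha)       # six poles
trunc_roots = wa_roots(trunc_poles, trunc_weights, alpha)    # only N + 1 poles
```

In this model the truncated problem also keeps the value 3 as a persistent eigenvalue, which is not a root of the determinant; the comparison above concerns only the first N roots.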
In this case the truncated base problem is given by

$$T_2 u = \lambda_1^{(0)}(u, u_1^{(0)})u_1^{(0)} + \lambda_2^{(0)}(u, u_2^{(0)})u_2^{(0)} + \lambda_3^{(0)}[u - (u, u_1^{(0)})u_1^{(0)} - (u, u_2^{(0)})u_2^{(0)}] = \lambda u$$

where $\lambda_1^{(0)}$, $\lambda_2^{(0)}$, $\lambda_3^{(0)}$, $u_1^{(0)}$ and $u_2^{(0)}$ are as given in Section 3. Letting $Bu = (1/r_{12})u$ as before and choosing functions

$$p_1 = [(1.5)^3/\pi]\,e^{-1.5(r_1 + r_2)}$$
$$p_2 = [(5\sqrt{5})/\pi]\,r_{12} \exp[-(5)^{1/2}(r_1 + r_2)]$$

Bazley and Fox were able to form an intermediate problem whose eigenvalues could be computed by hand.

Another problem to which Bazley and Fox applied truncation was the radial Schrödinger equation. In order to solve this problem the given eigenvalue problem

$$-d^2\psi/dx^2 - z[(1 - e^{-ax})/x]\psi = E\psi$$

is transformed into another eigenvalue problem in the following way. While ordinarily one would consider $E$ as the eigenvalue, here we fix the energy $E$ and take the charge $z$ as the eigenvalue. The numerical results may then be inverted to give the energy eigenvalues $E$. The reason for taking $z$ as the eigenvalue is that the resulting base problem has a pure point spectrum. We put $E = -k^2$ and introduce the transformations

$$t = 2kx, \qquad u(t) = t^{-1}\psi(t)$$

to obtain

$$-\frac{1}{t}\frac{d}{dt}\Big(t^2 \frac{du}{dt}\Big) + \frac{t}{4}\,u = \lambda(1 - e^{-at/2k})\,u$$

where $\lambda = z/2k$. We now have an eigenvalue problem of the form

$$Au = \lambda(I - B)u$$

where

$$Au = -\frac{1}{t}\frac{d}{dt}\Big(t^2 \frac{du}{dt}\Big) + \frac{t}{4}\,u$$

and

$$Bu = e^{-at/2k}\,u$$

We note that the base operator $A$ has known eigenvalues,

$$\lambda_i^{(0)} = i, \qquad i = 1, 2, \ldots$$

and normalized eigenfunctions

$$u_i^{(0)} = (1/i!\,i^{1/2})\,L_i^1(t)\,e^{-t/2}, \qquad i = 1, 2, \ldots$$

where $L_i^1$ is the first derivative of the $i$th Laguerre polynomial. The given problem has a pure point spectrum $\lambda_1 < \lambda_2 < \ldots$ diverging to infinity and satisfying

$$\lambda_i^{(0)} \le \lambda_i, \qquad i = 1, 2, \ldots$$

Here the intermediate problems are of the form

$$T_N u = \lambda(I - BP^n)u$$

where $T_N$ is the truncation operator previously defined and $P^n$ denotes a projection on functions $p_1, p_2, \ldots, p_n$, orthogonal with respect to the inner product $[u, v] = (u, Bv)$. In solving this problem Bazley and Fox put $p_i = u_i^{(0)}$ $(i = 1, 2, \ldots, n)$ and obtained the eigenvalues as solutions of an algebraic system.

For this example, the value of $a$ was fixed and the substitution $k = a/2$ was made so that $E = -a^2/4$. Upper bounds obtained by solving a fourth-order Rayleigh-Ritz problem based on the trial functions $u_1^{(0)}$, $u_2^{(0)}$, $u_3^{(0)}$ and $u_4^{(0)}$ together with the lower bounds obtained from the intermediate problems provided the estimates:

$$1.2587a \le z_1 \le 1.2590a$$
$$2.3944a \le z_2 \le 2.4164a$$
$$3.4207a \le z_3 \le 3.5576a$$

5. AN ANHARMONIC OSCILLATOR

Another variant in intermediate problems of the second-type, namely the generalized special choice, was used by Bazley and Fox (1961a) in estimating the eigenvalues of an anharmonic oscillator. Here the differential equation is given by

$$-u'' + x^2 u + \varepsilon x^4 u = \lambda u, \qquad -\infty < x < \infty$$

where $\varepsilon > 0$. Restricting our discussion to the even symmetry class, the base problem

$$Au = -u'' + x^2 u = \lambda u$$

has well-known eigenvalues

$$\lambda_i^{(0)} = 4i - 3, \qquad i = 1, 2, \ldots$$

The corresponding eigenfunctions are the linear oscillation eigenfunctions

$$u_i^{(0)} = C_i \exp(-x^2/2)\,H_{2i-2}(x), \qquad i = 1, 2, \ldots$$

Here $C_i = 2^{1-i}[(2i - 2)!]^{-1/2}\pi^{-1/4}$ and $H_i$ is the $i$th Hermite polynomial. Letting $B$ be the operator defined by

$$Bu = \varepsilon x^4 u$$

we once again have the given problem in the form

$$Au + Bu = \lambda u$$

We recall from Section 3 that a special choice of functions $p_i$ is given by $Bp_i = u_i^{(0)}$. In this problem, however, Bazley and Fox introduced a generalized special choice given by

[...] $\qquad i = 1, 2, \ldots, n$

It turns out that by putting $p_i = u_i^{(0)}$ and using a recurrence relation for Hermite polynomials the symmetric matrix $\{\beta_{ij}\}$ is readily obtained. The eigenvalues of the intermediate problems are then computable as the roots of a linear system.

For various values of $\varepsilon$ the intermediate problems yielded lower bounds to the first five eigenvalues, which, complemented by the Rayleigh-Ritz upper bounds (also given by Bazley and Fox), demonstrated the accuracy of the method.
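The upper-bound half of such a comparison is simple to reproduce in miniature. The sketch below is an illustration only, not the computation of Bazley and Fox: it solves the second-order Rayleigh-Ritz problem for the anharmonic oscillator with the two lowest even oscillator eigenfunctions as trial functions, using the standard oscillator matrix elements of x⁴ (3/4, 39/4 and 3√2/2 for the operator -u'' + x²u). The smaller root of the resulting 2 x 2 problem is an upper bound to the lowest eigenvalue; the first-order perturbation value 1 + 3ε/4 is the corresponding first-order Ritz value and is therefore never smaller. The function names are hypothetical.

```python
import math


def ritz_2x2(eps):
    """Lowest Rayleigh-Ritz value for -u'' + x^2 u + eps x^4 u with two even trial functions."""
    h11 = 1.0 + eps * 3.0 / 4.0              # (H u1, u1): base eigenvalue 1 plus eps * 3/4
    h22 = 5.0 + eps * 39.0 / 4.0             # (H u2, u2): base eigenvalue 5 plus eps * 39/4
    h12 = eps * 3.0 * math.sqrt(2.0) / 2.0   # (H u1, u2): only the x^4 term couples u1 and u2
    mean = 0.5 * (h11 + h22)
    half_gap = 0.5 * (h22 - h11)
    # Smaller eigenvalue of the symmetric matrix [[h11, h12], [h12, h22]].
    return mean - math.sqrt(half_gap ** 2 + h12 ** 2)


def first_order(eps):
    """First-order perturbation value for the lowest eigenvalue."""
    return 1.0 + 0.75 * eps


bounds = {eps: (ritz_2x2(eps), first_order(eps)) for eps in (0.0, 0.1, 1.0)}
```

A Rayleigh-Ritz value of this kind bounds the eigenvalue from above only; a matching lower bound still has to come from an intermediate problem.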
In fact, when compared graphically, the results for the first eigenvalue computed by intermediate problems and computed by perturbation theory show the overwhelming superiority of intermediate problems over perturbation theory. The Rayleigh-Ritz upper bounds and intermediate lower bounds were actually indistinguishable on the graph used, while even for quite small values of $\varepsilon$ the perturbation theory values were not even close.

6. FOX'S APPLICATION TO THE LITHIUM ATOM

One of the more significant advances in the applicability of intermediate problems to quantum mechanics was recently given by Fox (1972) who introduced a method of constructing intermediate operators (called comparison operators in Fox's terminology) which make it possible to compute lower bounds for the eigenvalues of the Schrödinger operators for atoms and ions having three or more electrons.

The basic pattern of given problem, base problem and intermediate problems is also in Fox's work. However, the actual form of the intermediate problems is new and fundamentally different from previously used intermediate problems of the first- and second-types.

The difficulty in dealing with the Schrödinger equation of atoms more complicated than helium is that the lowest point in the essential spectrum of the base operator lies below (or very close to) the first eigenvalue of the given operator. In intermediate problems of the second-type the intermediate problem is formed by adding an operator of finite rank to the base operator. Since the essential spectrum of the base operator is invariant under the addition of any compact operator, the eigenvalues of the intermediate problems would not provide meaningful numerical results.

In order to overcome this difficulty Fox constructs intermediate problems by adding operators to the base operator which provide, even though these operators are non-compact, intermediate problems whose eigenvalues can be numerically determined.
In order to accomplish this Fox used a technique of separation of variables. Moreover, he had to introduce and develop important results in the spectral theory for the separation of variables in Hilbert space, see Fox (1968, 1975).

Let us now illustrate the general concepts of Fox's method. We again suppose that the given problem is of the form

$Hu = Au + Bu = \lambda u$

Moreover, we assume that A can be resolved by elementary separation of variables and that the separation of variables used for A also allows B to be written as a certain sum relative to this separation of variables. A complete discussion of what such a decomposition involves would require us to go into some detail regarding tensor products of Hilbert spaces, see Fox (1968, 1972, 1975). Instead, for the purpose of the present chapter, we consider the specific problem of the Schrödinger equation for the non-relativistic fixed-nucleus model for the lithium atom without spin interaction. Here the given problem is

$Hu = -\tfrac{1}{2}\Delta u - (3/r_1)u - (3/r_2)u - (3/r_3)u + (1/r_{12})u + (1/r_{13})u + (1/r_{23})u = \lambda u$

where $\Delta$ is the nine-dimensional Laplacian, $r_i$ is the (Euclidean) distance from the nucleus to the ith electron (i = 1, 2, 3) and $r_{ij}$ is the distance between the ith electron and the jth electron. The base problem

$Au = -\tfrac{1}{2}\Delta u - (3/r_1)u - (3/r_2)u - (3/r_3)u = \lambda u$

factors (separation of variables) into three resolvable hydrogen-like operators, see Kemble (1958). In order to decompose B we consider the nine-dimensional coordinate system $(x_1, y_1, z_1, x_2, y_2, z_2, x_3, y_3, z_3)$ where $(x_i, y_i, z_i)$ gives the position of the ith electron (i = 1, 2, 3). We let $\mathfrak{H}$ denote the $\mathcal{L}^2$ space of functions defined on $(x_1, y_1, z_1, x_2, y_2, z_2, x_3, y_3, z_3)$, let $\mathfrak{H}_{ij}$ denote the $\mathcal{L}^2$ space of functions defined on $(x_i, y_i, z_i, x_j, y_j, z_j)$, and let $\mathfrak{H}_i$ denote the $\mathcal{L}^2$ space of functions defined on $(x_i, y_i, z_i)$.

If we now define $B_{12}$ to be multiplication by $r_{12}^{-1}$ in $\mathfrak{H}_{12}$ and let $I_3$ be the identity operator on $\mathfrak{H}_3$, we can form a tensor product operator $\mathcal{B}_{12} = B_{12} \times I_3$ which gives multiplication by $r_{12}^{-1}$ in $\mathfrak{H}$. The operators $\mathcal{B}_{13}$ and $\mathcal{B}_{23}$ may be formed in an analogous manner. Now the operator B can be decomposed into

$B = \mathcal{B}_{12} + \mathcal{B}_{13} + \mathcal{B}_{23}$

which is the decomposition necessary for Fox to form intermediate problems and apply his method.

Instead of approximating B by operators of finite rank, as in intermediate problems of the second type, here one approximates the operators $B_{ij}$ by operators of finite rank, say $B_{ij}^n$. While $B_{12}^n$ is an approximation to $B_{12}$ of finite rank, the tensor product operator $\mathcal{B}_{12}^n = B_{12}^n \times I_3$ is a non-compact approximation to $\mathcal{B}_{12}$. Similar non-compact approximations to $\mathcal{B}_{13}$ and $\mathcal{B}_{23}$ may be formed, say $\mathcal{B}_{13}^n$ and $\mathcal{B}_{23}^n$. The intermediate operator, which is then given by

$A + \mathcal{B}_{12}^n + \mathcal{B}_{13}^n + \mathcal{B}_{23}^n$

consists of the base operator A plus non-compact operators. This means that the essential spectrum may be displaced and lower bounds obtained.

It should be noted that once the decomposition of B is achieved the intermediate operators are formed in ways similar to those in problems of the second type, that is, by using special choices, generalized special choices and truncation.

These methods have been applied by Fox and Sigillito (1972a, b, c) to obtain bounds for the energy levels of radial lithium. The radial model is a simplification of the usual fixed-nucleus non-relativistic model based on the assumption that the electron distributions depend on the distances of the electrons from the nucleus only and not on the angular variables. This simplified model was used to test the methods numerically while avoiding the complexities of angular momenta.

The Hamiltonian for radial lithium is given by

$H = -\sum_{i=1}^{3} \left( \Delta_i/2 + 3/r_i \right) + \sum_{i<j} 1/\rho_{ij}$

where $\Delta_i$ is the radial Laplacian

$\Delta_i = \frac{1}{r_i^2} \frac{d}{dr_i} \left( r_i^2 \frac{d}{dr_i} \right)$

and $\rho_{ij} = \max[r_i, r_j]$.
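One way to make the replacement of $1/r_{ij}$ by $1/\rho_{ij}$ plausible is to average the inverse inter-electron distance over the angle between the two position vectors; the average equals $1/\max[r_i, r_j]$. The sketch below checks this numerically. It is offered as an illustration only, not as the derivation used by Fox and Sigillito, and the function name and the midpoint quadrature are assumptions of this example.

```python
import math


def angular_average_inverse_distance(r1, r2, steps=20000):
    """Average of 1/|x1 - x2| over the relative angle, for |x1| = r1, |x2| = r2.

    Uses the midpoint rule in u = cos(theta); intended for r1 != r2, where the
    integrand is smooth on the whole interval -1 <= u <= 1.
    """
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        u = -1.0 + (i + 0.5) * h
        total += 1.0 / math.sqrt(r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * u)
    return 0.5 * h * total


if __name__ == "__main__":
    for r1, r2 in ((1.0, 2.5), (0.7, 0.3)):
        print(r1, r2, angular_average_inverse_distance(r1, r2), 1.0 / max(r1, r2))
```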
The operator acts on functions of the three radial distances that are square integrable with respect to the weight function $r_1^2 r_2^2 r_3^2$.

Here the base operator is the sum of three resolvable one-electron hydrogenic Hamiltonians and the intermediate operators are formed by treating each pairwise coupling $1/\rho_{ij}$ separately as a two-electron operator. Then the resulting intermediate problems can be solved numerically by diagonalizing Hermitian matrices.

For total spin S equal to 1/2 and 3/2, upper bounds computed by the Rayleigh-Ritz method together with lower bounds obtained from the eigenvalues of the intermediate operators were given by Fox and Sigillito (1972b) and are reproduced in Table 1.

Table 1. Bounds for energies of radial lithium

S = 1/2                                        S = 3/2
$-7.620 \le \lambda_1 \le -7.488$              $-5.220 \le \lambda_1 \le -5.204$
$-7.493 \le \lambda_2 \le -7.324$              $-5.169 \le \lambda_2 \le -5.149$
$-7.457 \le \lambda_3 \le -7.275$              $-5.160 \le \lambda_3 \le -5.170$
$-7.418 \le \lambda_\infty \le -7.252$         $-5.123 \le \lambda_\infty \le -5.109$

(The last row refers to the first point in the essential spectrum.)

The bounds were subsequently improved by Fox and Sigillito (1972c). In particular, the lower bound for the first point in the essential spectrum for S = 1/2 was increased to -7.294. This result, together with the lower bound -5.123 given in the table, shows that the method is indeed successful in displacing the essential spectrum, since the lowest points in the essential spectrum of the corresponding base operators are -9 and -5.625, respectively.

Finally, we would like to mention that a similar approach to constructing intermediate operators for the lithium atom was published by Reid (1972). However, Reid's contribution appears to be purely formal and does not touch on the subtleties of separation of variables in Hilbert space and the properties of the spectra of operators acting on tensor products of Hilbert spaces. The contributions of Fox to the spectral theory of such operators, on the other hand, are a necessary and major part of the application of intermediate problems to lithium and other atoms.

7. CONCLUDING REMARKS

In the brief exposition of the present chapter we have given an overview of the applicability of the methods of intermediate problems to quantum theory. As a result of our attempt to emphasize what we feel to be the highlights in this regard, we have necessarily omitted many contributions to intermediate problems for eigenvalues which are important and interesting in their own right. For instance, we were not able to go into detail here about the work of Löwdin and his collaborators, which includes applications of intermediate problems and closely related methods to problems of quantum chemistry. A fairly complete bibliography of their work may be found by referring to Löwdin (1965, 1968), Stenger (1974) and Weinstein and Stenger (1972). On the other hand, we did devote a little more space to recent developments by Fox in applying intermediate problems to lithium, since the latter results appeared after the publication of Weinstein and Stenger (1972) and could not be included there.

Anyone interested in more details about solving intermediate problems of the first and second types, the applications of intermediate problems to classical mechanics, the relationships and various inequalities for eigenvalues, and results in functional analysis connected with intermediate problems is referred to the books by Gould (1966) and Weinstein and Stenger (1972), to the large number of primary references cited in these books, and to the more recent papers given in the references here.

REFERENCES

Aronszajn, N. (1951) 'Approximation methods for eigenvalues of completely continuous symmetric operators', Proc. Symp. Spectral Theory and Differential Problems, Stillwater, Oklahoma, pp. 179-202.
Bazley, N. W. (1959) 'Lower bounds for eigenvalues with application to the helium atom', Proc. Nat. Acad. Sci. U.S.A., 45, 850-853.
Bazley, N. W.
(1960) 'Lower bounds for eigenvalues with application to the helium atom', Phys. Rev., 129, 144-149.
Bazley, N. W. (1961) 'Lower bounds for eigenvalues', J. Math. Mech., 10, 289-308.
Bazley, N. W. and Fox, D. W. (1961a) 'Lower bounds for eigenvalues of Schrödinger's equation'.
Bazley, N. W. and Fox, D. W. (1961b) 'Truncations in the method of intermediate problems for lower bounds to eigenvalues', J. Res. Nat. Bur. Std. Sec. B, 65, 105-111.
Bazley, N. W. and Fox, D. W. (1962a) 'Error bounds for eigenvectors of self-adjoint operators', J. Res. Nat. Bur. Std. Sec. B, 66, 1-4.
Bazley, N. W. and Fox, D. W. (1962b) 'A procedure for estimating eigenvalues', J. Math. Phys., 3, 469-471.
Bazley, N. W. and Fox, D. W. (1962c) 'Lower bounds to eigenvalues using operator decompositions of the form B*B', Arch. Rational Mech. Anal., 10, 352-360.
Bazley, N. W. and Fox, D. W. (1963a) 'Error bounds for expectation values', Rev. Mod. Phys., 35, 712-715.
Bazley, N. W. and Fox, D. W. (1963b) 'Lower bounds for energy levels of molecular systems', J. Math. Phys., 4, 1147-1153.
Bazley, N. W. and Fox, D. W. (1964) 'Improvement of bounds to eigenvalues of operators of the form T*T', J. Res. Nat. Bur. Std. Sec. B, 68, 173-183.
Bazley, N. W. and Fox, D. W. (1966a) 'Methods for lower bounds to frequencies of continuous elastic systems', Z. Angew. Math. Phys., 17, 1-37.
Bazley, N. W. and Fox, D. W. (1966b) 'Error bounds for approximations to expectation values of unbounded operators', J. Math. Phys., 7, 413-416.
Bazley, N. W. and Fox, D. W. (1966c) 'Comparison operators for lower bounds to eigenvalues', J. Reine Angew. Math., 223, 142-149.
Fichera, G. (1965) Linear Elliptic Differential Systems and Eigenvalue Problems (Lecture Notes in Mathematics), Springer, New York.
Fox, D. W. (1968) Separation of variables and spectral theory for self-adjoint operators in Hilbert space, Informal Report, Applied Mathematics Group, Applied Physics Laboratory, The Johns Hopkins University, Silver Spring, Maryland.
Fox, D. W. (1972) 'Lower bounds for eigenvalues with displacement of essential spectra', SIAM J. Math. Anal., 3, 617-624.
Fox, D. W. (1975) 'Spectral measures and separation of variables', J. Res. Nat. Bur. Std.
Fox, D. W. and Sigillito, V. G. (1972a) 'Lower and upper bounds to energies of radial lithium', Chem. Phys. Letters, 13, 85-87.
Fox, D. W. and Sigillito, V. G. (1972b) 'Bounds for energies of radial lithium', J. Appl. Math. Phys., 23, 392-411.
Fox, D. W. and Sigillito, V. G. (1972c) 'New lower bounds for energies of radial lithium', Chem. Phys. Letters, 14, 583-585.
Gould, S. H. (1966) Variational Methods for Eigenvalue Problems: An Introduction to the Weinstein Method of Intermediate Problems, 2nd ed., University of Toronto Press.
Kato, T. (1951a) 'Fundamental properties of Hamiltonian operators of Schrödinger type', Trans. Amer. Math. Soc., 70, 195-211.
Kato, T. (1951b) 'On the existence of solutions of the helium wave equation', Trans. Amer. Math. Soc., 70, 212-218.
Kato, T. (1966) Perturbation Theory for Linear Operators, Springer, New York.
Kemble, E. C. (1958) The Fundamental Principles of Quantum Mechanics, Dover, New York.
Kinoshita, T. (1957) 'Ground state of the helium atom', Phys. Rev., 105, 1490.
Kuroda, S. T. (1961) 'On a generalization of the Weinstein-Aronszajn formula and the infinite determinant', Sci. Papers College Gen. Ed. Univ. Tokyo, 11, 1-12.
Löwdin, P. O. (1965) 'Studies in perturbation theory XI. Lower bounds to energy eigenvalues, ground state, and excited states', J. Chem. Phys., 43, S175-S185.
Löwdin, P. O. (1968) 'Studies in perturbation theory XIII. Treatment of constants of motion in resolvent method, partitioning technique, and perturbation theory', Intern. J.
Quantum Chem., 2, 867-931.
Reid, C. E. (1972) 'Intermediate Hamiltonians for the lithium atom', Intern. J. Quantum Chem., 6, 793-797.
Stenger, W. (1968) 'On the variational principles for eigenvalues for a class of unbounded operators', J. Math. Mech., 17, 641-648.
Stenger, W. (1969) 'On perturbations of finite rank', J. Math. Anal. Appl., 23, 625-635.
Stenger, W. (1970a) 'Some extensions and applications of the new maximum-minimum theory of eigenvalues', J. Math. Mech., 19, 931-944.
Stenger, W. (1970b) 'On Fichera's transformation in the method of intermediate problems', Rend. Accad. Naz. Lincei, 48, 302-305.
Stenger, W. (1974) 'Intermediate problems for eigenvalues', Intern. J. Quantum Chem., 8, 623-625.
Weinberger, H. F. (1959) A Theory of Lower Bounds for Eigenvalues, Tech. Note BN-183, IFDAM, University of Maryland, College Park, Maryland.
Weinstein, A. (1935) 'On a minimal problem in the theory of elasticity', J. London Math. Soc., 10, 184-192.
Weinstein, A. (1937) 'Études des spectres des équations aux dérivées partielles de la théorie des plaques élastiques', Mémor. Sci. Math., 88.
Weinstein, A. (1974) 'On non-self-adjoint perturbations of finite rank', J. Math. Anal. Appl., 45, 1-11.
Weinstein, A. and Stenger, W. (1972) Methods of Intermediate Problems for Eigenvalues: Theory and Ramifications, Academic Press, New York.

Position Observables of the Photon

K. KRAUS
Physikalisches Institut der Universität Würzburg, Germany

1. INTRODUCTORY REMARKS

Quantum theory was initiated by Planck's discovery of the discontinuous character of light emission and absorption and Einstein's subsequent hypothesis of light quanta. From the interference phenomena of light it was already apparent that these light quanta (or photons, as we call them now) could not be particles of the simple kind considered in classical mechanics. A more precise description of the 'non-classical' behaviour of particles, however, was first given much later by quantum mechanics.
Perhaps the most impressive deviation from 'classical' behaviour shows up in Heisenberg's famous uncertainty relation (Heisenberg, 1927)

$\Delta X_i \cdot \Delta P_i \ge \tfrac{1}{2}$   (1)

for the components of position X and momentum P of, for example, an electron.*

* As usual, we set $\hbar = c = 1$ in our system of units.

The photon itself, however, has not yet found its way into textbooks of quantum mechanics as an example of typical quantum properties of particles. Of course some simple interference or polarization experiments with light (which at sufficiently low intensities may be interpreted tentatively as experiments with 'single photons') are sometimes discussed in introductory textbooks. A more detailed treatment of one-photon quantum mechanics, however, is usually reserved for advanced texts, for example, on quantum electrodynamics.

This neglect of the photon is perhaps partly due to the following circumstance. It has been proved (Newton and Wigner, 1949; Wightman, 1962) that there is no self-adjoint (vector) operator X in the state space of the photon which, according to the rules of ordinary quantum mechanics, could be interpreted as a position observable. Usually this is taken to indicate that, simply, photons are not localizable at all. Accordingly an uncertainty relation like (1), which is so typical for massive quantum particles, could not even be formulated for photons (or other massless particles, e.g. neutrinos).

Quite apart from possible theoretical objections, however, the experiment itself seems to reject such a radical interpretation of the mentioned results. In fact single photons may be localized experimentally, at least above some energy threshold, by suitable detectors (counters, photographic emulsions, etc.), which moreover are very similar to the detectors for massive particles. Then, obviously, one has to ask how such experiments can be described theoretically.
It is clear from what has been said before that such a description can be obtained only if the usual requirements for a position observable are somewhat relaxed. A first proposal in this direction was made by Jauch and Piron (1967) and Amrein (1969). For reasons which will be explained later, a different approach is preferred here, which has already been sketched elsewhere (Kraus, 1971), and which is based on Ludwig's reformulation of quantum theory (Ludwig, 1970). It will be shown that, starting from a suitable generalization of the notion of observables as suggested by Ludwig's theory (see also Neumann, 1971), position observables for the photon may indeed be constructed. For these position observables we will then prove, among other things, the validity of the uncertainty relation (1).

Before discussing the localization problem, however, we will first have to review the usual quantum-mechanical description of single photons. This could be done in a most satisfactory way by starting from the representation theory of the Poincaré (i.e. the inhomogeneous Lorentz) group (Wigner, 1939). Moreover, group theory would allow a unified treatment of other elementary particles along with the photon. Since, however, the main purpose of the present paper is neither mathematical elegance nor complete generality, we have chosen a more elementary treatment which is particularly adapted to the photon. The subsequent construction and discussion of a position observable for the photon is also very elementary in this formalism.

The intentions of the present paper may be sketched as follows. First of all, we want to present the photon as just another example (perhaps a somewhat surprising one) of the universal validity of the celebrated uncertainty relation (1). Secondly, a natural and useful generalization of the concept of quantum-mechanical observables will be illustrated by the example of photon position.
We feel that, for both of these purposes, a fairly low level of mathematical sophistication and rigour is sufficient. Thus, more advanced mathematical techniques will be used only when (and to the extent that) they really help to clarify the matter and improve the presentation, and mathematical subtleties of a more technical type will often be omitted.

2. STATE SPACE AND ELEMENTARY OBSERVABLES OF A SINGLE PHOTON

The simplest quantum mechanical description of a single free photon is the following. Pure states* correspond to unit vectors in the Hilbert space $\mathcal{H}$ of complex square-integrable vector functions A(k) of a real vector k which satisfy the transversality condition

$\mathbf{k} \cdot \mathbf{A}(\mathbf{k}) = 0$   (2)

* Throughout this paper we will consider pure states only. The discussion of state mixtures (density matrices) is irrelevant for our present investigation.

The scalar product in $\mathcal{H}$ is given by

$\langle \mathbf{A}, \mathbf{A}' \rangle = \int d^3k \, \bar{\mathbf{A}}(\mathbf{k}) \cdot \mathbf{A}'(\mathbf{k})$   (3)

with a bar denoting complex conjugation. A complete system of commuting observables is given by the three momentum components

$P_i = k_i$   (4)

(multiplication operators) and the helicity (spin component in the direction of P)

$\sigma = \frac{\mathbf{P}}{|\mathbf{P}|} \cdot \mathbf{s} = \frac{\mathbf{k}}{\omega} \cdot \mathbf{s}$   (5)

Here $\omega = |\mathbf{k}|$, and the j-component $s_j$ of the spin operator s acts as the matrix

$(s_j)_{kl} = -i \varepsilon_{jkl}$   (6)

on vector functions A, with $\varepsilon_{jkl}$ denoting the Levi-Civita symbol. From (5) and (6) the action of $\sigma$ on vector functions A is easily calculated:

$(\sigma \mathbf{A})(\mathbf{k}) = i \frac{\mathbf{k}}{\omega} \times \mathbf{A}(\mathbf{k})$   (7)

With momentum P, the energy (Hamiltonian) H is also a multiplication operator:

$H = |\mathbf{P}| = |\mathbf{k}| = \omega$   (8)

The helicity operator $\sigma$ has eigenvalues ±1 in $\mathcal{H}$, with corresponding eigenspaces $\mathcal{H}_\pm$. In order to show this explicitly, we choose for each k a right-handed orthonormal set of polarization vectors $\mathbf{e}_1(\mathbf{k})$, $\mathbf{e}_2(\mathbf{k})$ and $\mathbf{e}_3(\mathbf{k}) = \mathbf{k}/\omega$. By (7), then, vector functions of the form

$\mathbf{A}_\pm(\mathbf{k}) = a_\pm(\mathbf{k}) \left( \mathbf{e}_1(\mathbf{k}) \pm i \mathbf{e}_2(\mathbf{k}) \right)$   (9)

with arbitrary coefficients $a_\pm(\mathbf{k})$ are eigenfunctions of $\sigma$ with eigenvalues ±1.
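The step from (5) and (6) to (7), and the eigenvalue statement for (9), can be checked at a single momentum k. The sketch below is an illustration with hypothetical function names; the particular way of completing k/ω to a right-handed orthonormal frame is an assumption of the example, since the text leaves the choice of $\mathbf{e}_1$, $\mathbf{e}_2$ open.

```python
import math


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def levi_civita(j, k, l):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (j - k) * (k - l) * (l - j) // 2


def helicity_via_spin(k, A):
    """sigma A = sum_j (k_j / omega) s_j A with (s_j)_{kl} = -i eps_{jkl}, as in (5), (6)."""
    omega = math.sqrt(sum(c * c for c in k))
    return [sum(k[j] / omega * (-1j) * levi_civita(j, a, b) * A[b]
                for j in range(3) for b in range(3)) for a in range(3)]


def helicity_via_cross(k, A):
    """sigma A = i (k / omega) x A, as in (7)."""
    omega = math.sqrt(sum(c * c for c in k))
    return [1j * c / omega for c in cross(k, A)]


def polarization_frame(k):
    """One possible right-handed orthonormal frame (e1, e2, e3) with e3 = k / omega."""
    omega = math.sqrt(sum(c * c for c in k))
    e3 = [c / omega for c in k]
    idx = min(range(3), key=lambda i: abs(e3[i]))  # axis least aligned with k
    helper = [1.0 if i == idx else 0.0 for i in range(3)]
    proj = sum(h * e for h, e in zip(helper, e3))
    e1 = [h - proj * e for h, e in zip(helper, e3)]
    norm = math.sqrt(sum(c * c for c in e1))
    e1 = [c / norm for c in e1]
    e2 = cross(e3, e1)  # then e1 x e2 = e3
    return e1, e2, e3


if __name__ == "__main__":
    k = [1.0, 2.0, -2.0]
    e1, e2, e3 = polarization_frame(k)
    for sign in (+1, -1):
        A = [e1[i] + sign * 1j * e2[i] for i in range(3)]
        print(sign, helicity_via_cross(k, A), A)
```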
Moreover, by (2), each $\mathbf{A} \in \mathcal{H}$ may be decomposed as

$\mathbf{A}(\mathbf{k}) = \mathbf{A}_+(\mathbf{k}) + \mathbf{A}_-(\mathbf{k})$

with suitable $\mathbf{A}_\pm \in \mathcal{H}_\pm$. Photon states of the form $\mathbf{A}_+$ and $\mathbf{A}_-$ correspond to right and left circular polarized light, respectively.

For later use a natural (but 'unphysical') extension of the state space $\mathcal{H}$ will now be constructed. This simply amounts to dropping the condition (2), while the inner product (3) is left unchanged. The enlarged Hilbert space is called $\hat{\mathcal{H}}$. The definitions (4), (7) and (8) of operators P, $\sigma$ and H in $\mathcal{H}$ make sense also for vector functions $\mathbf{A} \in \hat{\mathcal{H}}$, and thus may be taken to define natural extensions $\hat{\mathbf{P}}$, $\hat{\sigma}$ and $\hat{H}$ of these operators to the space $\hat{\mathcal{H}}$. In addition to (9), $\hat{\mathcal{H}}$ also contains vector functions of the form

$\mathbf{A}_0(\mathbf{k}) = a_0(\mathbf{k}) \, \mathbf{e}_3(\mathbf{k})$   (10)

which belong to the eigenvalue zero of $\hat{\sigma}$ and constitute the subspace $\hat{\mathcal{H}}_0$ of $\hat{\mathcal{H}}$. Such functions are 'unphysical' since they do not describe photon states, and consequently the assignment of 'momentum', 'energy' and 'helicity' to such 'states' by the operators $\hat{\mathbf{P}}$, $\hat{H}$ and $\hat{\sigma}$ is purely formal.

The concrete realization of the photon state space $\mathcal{H}$ given above has the advantage of providing a one-to-one correspondence between photon states and transverse vector functions A(k). The transformation law of such functions under Poincaré transformations, however, looks rather complicated if written down explicitly. It is therefore better to describe it implicitly in terms of another realization of the space $\mathcal{H}$.

We consider the space $\mathcal{B}$ of complex four-vector functions

$B^\nu(\mathbf{k}) = \{ \mathbf{B}(\mathbf{k}), B^4(\mathbf{k}) \}$   (11)

with the additional requirements

$k_\nu B^\nu = \mathbf{k} \cdot \mathbf{B} - \omega B^4 = 0$   (12)

(Lorentz condition; $k^4 = \omega$) and

$\int d\mu(\mathbf{k}) \, \bar{B}_\nu(\mathbf{k}) B^\nu(\mathbf{k}) < \infty, \qquad d\mu(\mathbf{k}) = \frac{d^3k}{\omega}$   (13)

(square integrability).* With the inner product

$\langle B, B' \rangle = \int d\mu(\mathbf{k}) \, \bar{B}_\nu(\mathbf{k}) B'^\nu(\mathbf{k})$   (14)

$\mathcal{B}$ becomes something like a Hilbert space.

* Notation for four-vectors: $a^\nu = \{\mathbf{a}, a^4\}$, $\nu = 1 \ldots 4$; $a_\nu = a^\nu$ for $\nu = 1, 2, 3$; $a_4 = -a^4$; $a_\nu b^\nu = \mathbf{a} \cdot \mathbf{b} - a^4 b^4$ (sum convention).

In virtue of (12) the space part B of any $B^\nu \in \mathcal{B}$ may be decomposed according to

$\mathbf{B} = \omega^{1/2} \mathbf{A} + \frac{\mathbf{k}}{\omega} \left( \frac{\mathbf{k}}{\omega} \cdot \mathbf{B} \right) = \omega^{1/2} \mathbf{A} + B^4 \frac{\mathbf{k}}{\omega}$   (15)

into a transverse part $\omega^{1/2}\mathbf{A}$ (i.e. $\mathbf{k} \cdot \mathbf{A} = 0$) and a longitudinal component $(\mathbf{k}/\omega) \cdot \mathbf{B} = B^4$. Then (14) immediately yields

$\langle B, B' \rangle = \int d^3k \, \bar{\mathbf{A}}(\mathbf{k}) \cdot \mathbf{A}'(\mathbf{k})$   (16)

(The factor $\omega^{1/2}$ has been introduced in (15) since we want to have $d^3k$, instead of $d\mu(\mathbf{k})$, in (16).)

From (16), the positivity of the inner product (14) follows. Moreover, since only the transverse part of B enters (16), we see that all four-vector functions $B^\nu$ which differ only with respect to $(\mathbf{k}/\omega) \cdot \mathbf{B} = B^4$ represent the same vector in the Hilbert space defined by the inner product (14). In other words: functions $B^\nu \in \mathcal{B}$ with vanishing transverse part $\omega^{1/2}\mathbf{A}$ of B are zero vectors with respect to the inner product (14). Such functions constitute a subspace $\mathcal{B}_0$ of $\mathcal{B}$. Then, not $\mathcal{B}$ itself, but the space $\mathcal{B}/\mathcal{B}_0$ of equivalence classes in $\mathcal{B}$ with respect to $\mathcal{B}_0$, is a Hilbert space. Such an equivalence class consists of all $B^\nu$ of the form

$B^\nu = \left\{ \omega^{1/2} \mathbf{A} + B^4 \frac{\mathbf{k}}{\omega}, \; B^4 \right\}$   (17)

with a given transverse A and arbitrary $B^4$.

By (3) and (16), the Hilbert spaces $\mathcal{H}$ and $\mathcal{B}/\mathcal{B}_0$ may be identified in an obvious way, as already indicated by the use of the same symbol A in both cases. Thus $\mathcal{B}/\mathcal{B}_0$ is also a realization of the photon state space. (This realization is formally analogous to the Fermi gauge in quantum electrodynamics, whereas the former one corresponds to the Coulomb gauge. There also exists a description of one-photon states which corresponds to the Gupta-Bleuler gauge, but, contrary to what happens in quantum electrodynamics, this gauge is not very useful here.)
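The passage from (14) to (16) rests on a pointwise identity: with B built as in (17), the Lorentz condition (12) holds and $\bar{B}_\nu B'^\nu/\omega$ equals $\bar{\mathbf{A}} \cdot \mathbf{A}'$, whatever the values of $B^4$ and $B'^4$. The sketch below checks this at one momentum; the helper names and the sample numbers are assumptions of the example.

```python
import math


def dot(a, b):
    """Plain bilinear dot product (no complex conjugation)."""
    return sum(x * y for x, y in zip(a, b))


def transverse_part(k, v):
    """Remove the component of v along k, so that k . A = 0."""
    kk, s = dot(k, k), dot(k, v)
    return [v[j] - k[j] * s / kk for j in range(3)]


def four_vector(k, A, B4):
    """B^nu = {omega^(1/2) A + B4 k / omega, B4}, the form (17)."""
    omega = math.sqrt(dot(k, k))
    return [math.sqrt(omega) * A[j] + B4 * k[j] / omega for j in range(3)] + [B4]


def lorentz_condition(k, B):
    """k_nu B^nu = k . B - omega B^4, the left-hand side of (12)."""
    return dot(k, B[:3]) - math.sqrt(dot(k, k)) * B[3]


def invariant_density(k, B, Bp):
    """conj(B)_nu B'^nu / omega, the integrand of (14) per unit d^3k."""
    omega = math.sqrt(dot(k, k))
    space = sum(complex(B[j]).conjugate() * Bp[j] for j in range(3))
    return (space - complex(B[3]).conjugate() * Bp[3]) / omega


if __name__ == "__main__":
    k = [0.3, -1.2, 0.5]
    A = transverse_part(k, [1 + 2j, 0.5j, -1.0])
    Ap = transverse_part(k, [0.2, -1j, 0.4 + 0.1j])
    B, Bp = four_vector(k, A, 0.7 - 0.2j), four_vector(k, Ap, -1.3 + 0.4j)
    print(lorentz_condition(k, B))
    print(invariant_density(k, B, Bp), sum(a.conjugate() * b for a, b in zip(A, Ap)))
```

Changing $B^4$ or $B'^4$ leaves the printed density unchanged, which is the statement that the longitudinal component does not contribute to (16).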
A certain disadvantage of the new formalism lies in the fact that, due to the arbitrariness of $B^4$ in (17), the correspondence between photon states and four-vector functions $B^\nu(\mathbf{k})$ is no longer one-to-one. (In fact, $B^\nu(\mathbf{k})$ and

$B'^\nu(\mathbf{k}) = B^\nu(\mathbf{k}) + k^\nu \chi(\mathbf{k})$

with arbitrary $\chi(\mathbf{k})$, represent the same photon state. This, obviously, corresponds to a certain class of gauge transformations in classical electrodynamics.) This disadvantage is more than compensated, however, by the simple (four-vector) transformation law of $B^\nu \in \mathcal{B}$ under Poincaré transformations. For a Poincaré transformation consisting of a homogeneous orthochronous* Lorentz transformation $\Lambda$ (with matrix $\Lambda^\nu{}_\mu$; $\nu, \mu = 1 \ldots 4$) and a subsequent four-translation a (with components $a^\nu$, $\nu = 1 \ldots 4$), this transformation law is simply

$B^\nu(\mathbf{k}) \to (U(a, \Lambda)B)^\nu(\mathbf{k}) = e^{-i k_\mu a^\mu} \Lambda^\nu{}_\mu B^\mu(\Lambda^{-1}\mathbf{k})$   (18)

(Here $\Lambda^{-1}\mathbf{k}$ is the space part of the four-vector resulting from $k^\nu = \{\mathbf{k}, \omega\}$ by the Lorentz transformation $\Lambda^{-1}$.) As easily shown, equation (18) defines a representation of the Poincaré group on $\mathcal{B}$.

* The behaviour of $B^\nu$ under Lorentz transformations with time reversal looks somewhat more complicated.

From (18) and the Lorentz invariance

$d\mu(\mathbf{k}) = d\mu(\Lambda^{-1}\mathbf{k})$

of the measure $d\mu(\mathbf{k})$, the invariance under (18) of the inner product (14) in $\mathcal{B}$ is easily proved. In particular, $U(a, \Lambda)$ transforms the space $\mathcal{B}_0$ into itself. Therefore it may also be interpreted as a transformation of the equivalence classes (17), and is unitary in $\mathcal{B}/\mathcal{B}_0$. Since $\mathcal{B}/\mathcal{B}_0 = \mathcal{H}$, finally, $U(a, \Lambda)$ also yields a unitary transformation law for the transverse vector functions A(k) in our previous formalism. The explicit calculation of this transformation law is straightforward but unnecessary since we will not need it here. [For a particular case see equation (39).]

We conclude this Section with an elementary investigation of how helicity behaves under Poincaré transformations.
The helicity operator $\sigma$ may be transferred to $\mathcal{B}$ by defining, for any

$B^\nu = \{\mathbf{B}, B^4\} = \left\{ \omega^{1/2}\mathbf{A} + \frac{\mathbf{k}}{\omega} B^4, \; B^4 \right\} \in \mathcal{B}$

$(\sigma B)^\nu = \left\{ i \frac{\mathbf{k}}{\omega} \times \omega^{1/2}\mathbf{A}, \; 0 \right\} = \left\{ i \frac{\mathbf{k}}{\omega} \times \mathbf{B}, \; 0 \right\}$   (19)

It is obvious that this operator $\sigma$ in $\mathcal{B}$ induces a transformation of $\mathcal{B}/\mathcal{B}_0 = \mathcal{H}$ which coincides with the operator $\sigma$ in $\mathcal{H}$ previously defined by (7). By (19) the equivalence class of a given $B^\nu \in \mathcal{B}$ describes a photon state of helicity ±1 if and only if

$i \frac{\mathbf{k}}{\omega} \times \mathbf{B} = \pm \omega^{1/2} \mathbf{A} = \pm \left( \mathbf{B} - \frac{\mathbf{k}}{\omega} B^4 \right)$

or, with $\omega = k^4$,

$i (\mathbf{k} \times \mathbf{B}) = \pm (k^4 \mathbf{B} - B^4 \mathbf{k})$

The last equation can be rewritten as

$i \varepsilon_{\kappa\lambda\mu\nu} k^\mu B^\nu = \pm (k_\kappa B_\lambda - k_\lambda B_\kappa)$   (20)

with the Levi-Civita symbol $\varepsilon_{\kappa\lambda\mu\nu}$. The invariance of (20) under pure space-time translations follows trivially from (18). According to (18), both $k^\nu$ and $B^\nu$ behave like four-vectors under pure (orthochronous) Lorentz transformations* whereas, as is well known, $\varepsilon_{\kappa\lambda\mu\nu}$ is a pseudotensor. Thus equation (20) is also invariant under proper orthochronous Lorentz transformations, but under space reflection the left-hand side changes sign. Therefore the helicity eigenspaces $\mathcal{H}_\pm$ of $\mathcal{H}$ are invariant under proper Lorentz transformations, whereas space reflection interchanges $\mathcal{H}_+$ and $\mathcal{H}_-$.

* I.e. for $B'^\nu = (U(0, \Lambda)B)^\nu$ we have $B'^\nu(\mathbf{k}') = \Lambda^\nu{}_\mu B^\mu(\mathbf{k})$ with $k'^\nu = \Lambda^\nu{}_\mu k^\mu$.

Many elementary particles (e.g. electron, proton, neutrinos) may be characterized by the fact that their state space carries an irreducible representation of the proper Poincaré group. The state of such a particle is uniquely determined by the expectation values of all 'kinematic' observables, i.e. all infinitesimal generators of Poincaré transformations (energy, momentum, angular momentum, etc.). This is not so for the photon, since the helicity eigenspaces $\mathcal{H}_+$ and $\mathcal{H}_-$ reduce the representation $U(a, \Lambda)$ (for $\Lambda$ proper).
With respect to 'kinematic' observables, therefore, a coherent superposition

$\alpha \mathbf{A}_+ + \beta \mathbf{A}_-$   (21)

of normalized states $\mathbf{A}_\pm \in \mathcal{H}_\pm$ (with $|\alpha|^2 + |\beta|^2 = 1$) cannot be distinguished from the incoherent mixture of these states with the weights $|\alpha|^2$ and $|\beta|^2$, respectively. However, there are observables which permit such a distinction. For instance, suitable states of the form (21) (with $|\alpha|^2 = |\beta|^2 = \tfrac{1}{2}$) correspond to linear polarization, whereas the corresponding mixtures describe totally unpolarized light. Measurements of linear polarization thus do not belong to the 'kinematic' observables of the photon. The position observable to be constructed in the subsequent section will be another example of observables of this 'non-kinematic' type.

3. CONSTRUCTION OF A PHOTON POSITION OBSERVABLE

In order to be acceptable as a photon position operator* in the sense of usual quantum mechanics, a (vector) operator X on the photon state space $\mathcal{H}$ has to satisfy two requirements. First, its components $X_i$ have to be self-adjoint, and have to commute with each other in order to be measurable together. Secondly, the behaviour of X under spatial rotations and translations (i.e. Euclidean transformations) is prescribed to be

$U^*(\mathbf{a}, R) \, \mathbf{X} \, U(\mathbf{a}, R) = R\mathbf{X} + \mathbf{a}$   (22)

Here $U(\mathbf{a}, R)$ is the restriction of the Poincaré group representation (18) to the Euclidean group, the elements of which consist of space rotations (or reflections) R and subsequent translations a. Equation (22) is equivalent to the self-evident requirement

$\langle \mathbf{X} \rangle_{U(\mathbf{a}, R)\mathbf{A}} = R \langle \mathbf{X} \rangle_{\mathbf{A}} + \mathbf{a}$   (23)

for the expectation values

$\langle \mathbf{X} \rangle_{\mathbf{A}} = \langle \mathbf{A}, \mathbf{X}\mathbf{A} \rangle$   (24)

of X in arbitrary states A.

Another way of describing position measurements is as follows. Spatial localizability of the photon implies the existence of observables $E(\Delta)$, corresponding to largely arbitrary space regions $\Delta$,† which take the value one (respectively zero) if at time t = 0 the photon is found (respectively not found) inside the region $\Delta$.
Measurements of such $E(\Delta)$ should be actually feasible, at least for certain regions $\Delta$, by means of suitable counters. According to the rules of ordinary quantum mechanics, these 'yes-no' observables have to be represented by projection operators on $\mathcal{H}$ (which, for simplicity, are also called $E(\Delta)$), and the probability that at time t = 0 a photon in state A 'triggers the counter $E(\Delta)$ in the region $\Delta$' is

$w_{\mathbf{A}}(\Delta) = \langle \mathbf{A}, E(\Delta)\mathbf{A} \rangle$   (25)

* Throughout this paper we will use the Heisenberg picture. The position observables to be discussed thus refer to position measurements at a fixed time, t = 0 say. (For the conserved quantities considered before, such specification of time was unnecessary.)
† Precisely: to all Borel sets $\Delta$.

This physical interpretation of $E(\Delta)$ immediately implies, with $\emptyset$ denoting the empty set and $\mathbb{R}^3$ denoting all of space,

$E(\emptyset) = 0, \qquad E(\mathbb{R}^3) = 1$
$E\left( \bigcup_i \Delta_i \right) = \sum_i E(\Delta_i) \quad \text{if } \Delta_i \cap \Delta_j = \emptyset \text{ for } i \ne j$   (26)

(The last relation follows from the additivity of the probabilities (25) for mutually disjoint regions $\Delta_i$.) A correspondence $\Delta \to E(\Delta)$ of space regions $\Delta$ and projection operators $E(\Delta)$ with the properties (26) is called a spectral measure on $\mathbb{R}^3$. Equations (26) imply that any two $E(\Delta)$ commute with each other, and that

$E\left( \bigcap_i \Delta_i \right) = \prod_i E(\Delta_i)$
$E(\Delta_1 \cup \Delta_2) = E(\Delta_1) + E(\Delta_2) - E(\Delta_1 \cap \Delta_2)$   (27)
$E(\Delta') = 1 - E(\Delta)$

with $\Delta'$ denoting the complement of $\Delta$. The requirement corresponding to (23) is now

$w_{\mathbf{A}}(\Delta) = w_{U(\mathbf{a}, R)\mathbf{A}}(\Delta_{\mathbf{a}, R})$   (28)

with

$\Delta_{\mathbf{a}, R} = R\Delta + \mathbf{a} = \{ \mathbf{x} \mid \mathbf{x} = R\mathbf{y} + \mathbf{a}, \; \mathbf{y} \in \Delta \}$

(i.e. the region obtained from $\Delta$ by the rotation R and translation a), and is also self-explanatory. Since the state A in (28) is arbitrary, this condition is equivalent to

$U(\mathbf{a}, R) \, E(\Delta) \, U^*(\mathbf{a}, R) = E(\Delta_{\mathbf{a}, R})$   (29)

A spectral measure on $\mathcal{H}$ which satisfies (29) is called Euclidean covariant with respect to the given representation $U(\mathbf{a}, R)$, or simply: covariant.
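The algebra of (26) and (27) can be tried out on a finite toy model. The sketch below assumes, purely for illustration, that space is replaced by a 3 x 3 x 3 grid of cells and that $E(\Delta)$ acts as multiplication by the indicator function of $\Delta$; the names `projection`, `multiply` and `probability` are hypothetical, and nothing here captures the covariance condition (29) or the actual photon state space.

```python
from itertools import product

# 27 grid cells standing in for points of space (an assumption of this toy model).
CELLS = list(product(range(3), repeat=3))


def projection(region):
    """Diagonal of E(region): multiplication by the indicator function of the region."""
    return [1 if cell in region else 0 for cell in CELLS]


def multiply(p, q):
    """Product of two diagonal operators."""
    return [a * b for a, b in zip(p, q)]


def probability(state, region):
    """w_A(region) = <A, E(region) A> for a list of amplitudes, as in (25)."""
    return sum(abs(a) ** 2 * e for a, e in zip(state, projection(region)))


if __name__ == "__main__":
    d1 = {c for c in CELLS if c[0] == 0}
    d2 = {c for c in CELLS if c[1] <= 1}
    e1, e2 = projection(d1), projection(d2)
    union = [a + b - ab for a, b, ab in zip(e1, e2, multiply(e1, e2))]
    print(projection(d1 | d2) == union)            # second relation of (27)
    print(projection(d1 & d2) == multiply(e1, e2))  # first relation of (27)
    state = [27 ** -0.5] * 27
    print(probability(state, d1))
```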
The equivalence of the two descriptions of position measurements follows from the fact that one may construct X if the $E(\Delta)$ are given, and vice versa. Assume first the spectral measure $E(\Delta)$ to be given. From the physical interpretation (25) of $\langle \mathbf{A}, E(\Delta)\mathbf{A} \rangle$ we conclude that, with $dE(\mathbf{x}) = E(d^3x)$,

$\int x_j \langle \mathbf{A}, dE(\mathbf{x})\mathbf{A} \rangle$

is the expectation value* of the jth photon coordinate in state A. In order to represent this as $\langle \mathbf{A}, X_j \mathbf{A} \rangle$ with the component $X_j$ of a position operator X, we have to take

$X_j = \int x_j \, dE(\mathbf{x})$   (30)

This formula, indeed, defines three self-adjoint operators $X_j$, and provides a common spectral representation of them. The more familiar spectral representations

$X_j = \int \lambda \, dE_j(\lambda)$   (31)

with one-dimensional spectral families $E_j(\lambda)$ follow from (30) if we define

$E_j(\lambda) = E(\Delta_{j\lambda}), \qquad \Delta_{j\lambda} = \{ \mathbf{x} \mid x_j < \lambda \}$

Therefore the operators $X_j$ commute with each other in the sense that

$[E_j(\lambda), E_k(\mu)] = 0$ for all j, k, $\lambda$ and $\mu$   (32)

which is somewhat stronger than 'naive' commutativity, i.e.

$[X_j, X_k] = 0$   (33)

(on the dense domain where the left-hand side exists).

Vice versa, any three self-adjoint operators $X_j$ which commute in the sense of (32) possess a common spectral representation of the form (30). The projection operators $E(\Delta)$ of the corresponding spectral measure may be calculated explicitly as

$E(\Delta) = \int_\Delta dE(\mathbf{x}) = \int \chi_\Delta(\mathbf{x}) \, dE(\mathbf{x})$   (34)

with the characteristic function

$\chi_\Delta(\mathbf{x}) = 1$ for $\mathbf{x} \in \Delta$, $\quad \chi_\Delta(\mathbf{x}) = 0$ for $\mathbf{x} \notin \Delta$

of the region $\Delta$. The last expression in (34) is simply the operator function $\chi_\Delta(\mathbf{X})$ of X, in accordance with the physical meaning of $E(\Delta)$. Finally the covariance requirements (22) for X and (29) for $E(\Delta)$ may also be shown to be equivalent.

The result of Newton and Wigner (1949) and Wightman (1962) is, simply, that the photon does not possess a position observable with the required properties. [The non-existence of an operator X was first proved by Newton and Wigner (1949) who, however, needed some additional assumptions for their proof.
Later on Wightman (1962) was able to prove the non-existence of a covariant spectral measure $E(\Delta)$ without any additional requirements.] We do not want to reproduce these proofs here, but will start instead with a naive attempt to construct a photon position operator $\mathbf{X}$ explicitly. The failure of this attempt will then illustrate the 'no-go' theorem of Newton, Wigner and Wightman. Besides this, however, a suitable refinement of this construction will lead us directly to a (generalized) position observable of the photon.

*This makes sense as a Stieltjes integral.

By (3) and (4), $\mathbf{A}(\mathbf{k})$ may be interpreted as the photon wave function in the momentum representation.* In analogy to ordinary quantum mechanics, we thus attempt to define a position operator $\hat{\mathbf{X}}$ by

$$(\hat{X}_j \mathbf{A})(\mathbf{k}) = i\,\frac{\partial}{\partial k_j}\mathbf{A}(\mathbf{k}) \tag{35}$$

However, if applied to $\mathbf{A} \in \mathcal{H}$ these operators $\hat{X}_j$ destroy the transversality (2) since, in general,

$$\mathbf{k}\cdot\frac{\partial \mathbf{A}}{\partial k_j} = \frac{\partial}{\partial k_j}(\mathbf{k}\cdot\mathbf{A}) - A_j \tag{36}$$

is not zero if $\mathbf{k}\cdot\mathbf{A} = 0$. This difficulty is absent if we read (35) as defining operators $\hat{X}_j$ on the larger Hilbert space $\hat{\mathcal{H}}$ introduced in Section 2. Equation (35) then defines three self-adjoint $\hat{X}_j$ on $\hat{\mathcal{H}}$, which commute with each other in the sense of (32). In fact, $\hat{X}_j$ acts as multiplication by $x_j$ on the position space wave functions $\mathbf{A}(\mathbf{x})$ obtained from $\mathbf{A}(\mathbf{k})$ by Fourier transformation, and the spectral projections $\hat{E}(\Delta)$ of $\hat{\mathbf{X}}$ then correspond to multiplication by the characteristic functions $\chi_\Delta(\mathbf{x})$ of the regions $\Delta$.

The difficulty indicated by (36) may now be circumvented as follows. Denoting by $\Phi$ the projection operator which projects $\hat{\mathcal{H}}$ onto its physical subspace $\mathcal{H}$ and by $\hat{O}|_{\mathcal{H}}$ the restriction to $\mathcal{H}$ of an operator $\hat{O}$ acting on $\hat{\mathcal{H}}$, we define operators $X_j$ on $\mathcal{H}$ by†

$$X_j = \Phi \hat{X}_j \big|_{\mathcal{H}} \tag{37}$$

This definition is chosen such that, for $\mathbf{A} \in \mathcal{H}$,

$$\langle \mathbf{A}, X_j \mathbf{A}\rangle = \langle \mathbf{A}, \hat{X}_j \mathbf{A}\rangle \tag{38}$$

In this sense the operators $X_j$ on $\mathcal{H}$ are substitutes for the $\hat{X}_j$ which lead out of $\mathcal{H}$.
Since $\hat{X}_j$ is self-adjoint, (38) implies that $\langle \mathbf{A}, X_j \mathbf{A}\rangle$ is real; therefore the $X_j$ are at least symmetric. (We claim that they are even self-adjoint, but since this property is unessential here we did not try to prove it.)

A little detour is appropriate if we want to discuss Euclidean transformations of these operators $X_j$. The transformation law of state functions $\mathbf{A}(\mathbf{k}) \in \mathcal{H}$ under Euclidean transformations follows from (18) as

$$(U(\mathbf{a},R)\mathbf{A})(\mathbf{k}) = e^{-i\mathbf{k}\cdot\mathbf{a}}\, R\,\mathbf{A}(R^{-1}\mathbf{k}) \tag{39}$$

As expected, $\mathbf{A}$ behaves as a vector under space rotations. This transformation law may be extended quite naturally to the enlarged Hilbert space $\hat{\mathcal{H}}$ by taking

$$(\hat{U}(\mathbf{a},R)\mathbf{A})(\mathbf{k}) = e^{-i\mathbf{k}\cdot\mathbf{a}}\, R\,\mathbf{A}(R^{-1}\mathbf{k}) \tag{40}$$

for $\mathbf{A} \in \hat{\mathcal{H}}$ as well. [For clarity of notation we have used the symbol $\hat{U}(\mathbf{a},R)$ for the extension of $U(\mathbf{a},R)$ to $\hat{\mathcal{H}}$. This extension is related to, and consistent with, our previous extensions of momentum and helicity operators to $\hat{\mathcal{H}}$; in fact, the latter may be expressed in terms of infinitesimal generators of $\hat{U}(\mathbf{a},R)$.] Since under $\hat{U}(\mathbf{a},R)$ both $\mathbf{k}$ and $\mathbf{A}$ transform as vectors, the longitudinal part $\mathbf{A}_0 = (\mathbf{k}/\omega)[(\mathbf{k}/\omega)\cdot\mathbf{A}]$ and the transverse part $\mathbf{A}_{\mathrm{tr}} = \mathbf{A} - \mathbf{A}_0$ of an arbitrary $\mathbf{A} \in \hat{\mathcal{H}}$ transform separately under $\hat{U}(\mathbf{a},R)$. This implies

$$[\hat{U}(\mathbf{a},R), \Phi] = 0, \qquad \hat{U}(\mathbf{a},R)\big|_{\mathcal{H}} = U(\mathbf{a},R) \tag{41}$$

i.e. $\Phi$ reduces $\hat{U}(\mathbf{a},R)$, and the subrepresentation of $\hat{U}(\mathbf{a},R)$ in $\mathcal{H}$ is $U(\mathbf{a},R)$.

A straightforward calculation with (35) and (40) yields

$$\hat{U}^*(\mathbf{a},R)\,\hat{\mathbf{X}}\,\hat{U}(\mathbf{a},R) = R\hat{\mathbf{X}} + \mathbf{a} \tag{42}$$

as to be expected from the vector character of $\hat{\mathbf{X}} = i\nabla_{\mathbf{k}}$. Together with (41) this immediately leads to a transformation law of the desired form (22) for the operator $\mathbf{X}$ defined by (37).

*Namely, $\overline{\mathbf{A}(\mathbf{k})}\cdot\mathbf{A}(\mathbf{k})$ is the probability density in momentum space corresponding to a (normalized) state $\mathbf{A} \in \mathcal{H}$. For this it is essential that the inner product (3) is defined with $d^3k$ instead of

†It is easily proved that (37) yields operators whose domain of definition is dense in $\mathcal{H}$.
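The 'straightforward calculation' leading to the transformation law of the momentum-space derivative operator under Euclidean transformations can be made explicit. The following LaTeX sketch is one way to carry it out; it assumes only the definition of the operator as $i\,\partial/\partial k_j$ acting componentwise and the transformation law with the phase $e^{-i\mathbf{k}\cdot\mathbf{a}}$, and uses $R^{-1} = R^{T}$ for rotations. The index conventions are chosen here for illustration.

```latex
% Assumptions: (\hat X_j A)(k) = i \partial A / \partial k_j, and
% (\hat U(a,R) A)(k) = e^{-i k \cdot a} R A(R^{-1} k), with R^{-1} = R^T.
\begin{align*}
\bigl(\hat X_j \hat U(\mathbf a,R)\mathbf A\bigr)(\mathbf k)
 &= i\,\frac{\partial}{\partial k_j}
    \Bigl[e^{-i\mathbf k\cdot\mathbf a}\,R\,\mathbf A(R^{-1}\mathbf k)\Bigr] \\
 &= a_j\,e^{-i\mathbf k\cdot\mathbf a}\,R\,\mathbf A(R^{-1}\mathbf k)
    + e^{-i\mathbf k\cdot\mathbf a}\,R\;
      i\,\frac{\partial}{\partial k_j}\mathbf A(R^{-1}\mathbf k) .
\end{align*}
% Chain rule, using (R^{-1})_{lj} = R_{jl}:
\begin{align*}
\frac{\partial}{\partial k_j}\mathbf A(R^{-1}\mathbf k)
 = \sum_l R_{jl}\,(\partial_l \mathbf A)(R^{-1}\mathbf k) .
\end{align*}
% Hence
\begin{align*}
\hat X_j\,\hat U(\mathbf a,R)
 = \hat U(\mathbf a,R)\Bigl(\sum_l R_{jl}\hat X_l + a_j\Bigr)
 \quad\Longleftrightarrow\quad
 \hat U^*(\mathbf a,R)\,\hat{\mathbf X}\,\hat U(\mathbf a,R)
 = R\hat{\mathbf X} + \mathbf a .
\end{align*}
```

The rotation matrix acting on the vector index of $\mathbf{A}$ passes through the derivative because $\hat{X}_l$ acts on each component separately.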
However, we know from Newton and Wigner (1949) and Wightman (1962) that our construction of a photon position operator has to fail somewhere. This failure is indeed easily seen. With (35), (36) and (37) we obtain explicitly, for $\mathbf{A} \in \mathcal{H}$,

$$(X_j \mathbf{A})(\mathbf{k}) = i\left(\frac{\partial \mathbf{A}(\mathbf{k})}{\partial k_j} + \frac{\mathbf{k}}{\omega^2}\,A_j(\mathbf{k})\right) \tag{43}$$

From this we find by a trivial calculation that the commutativity condition (33) is violated by our $\mathbf{X}$.

In the rest of this Section we will try to show that $\mathbf{X}$, in spite of not being an ordinary photon position operator, nevertheless may have something to do with photon position. As already mentioned, the operator $\hat{\mathbf{X}}$ on $\hat{\mathcal{H}}$ has self-adjoint and mutually commuting components. By (42) it also satisfies the transformation law required for a position operator. Therefore the spectral measure $\hat{E}(\Delta)$ associated with $\hat{\mathbf{X}}$ is covariant with respect to $\hat{U}(\mathbf{a},R)$. Any difficulties associated with the photon position operator would thus be absent if, instead of $\mathcal{H}$, the enlarged Hilbert space $\hat{\mathcal{H}}$ were the physical state space of the photon.

This suggests the following tentative description of position measurements for photons. We consider $\hat{\mathbf{X}}$ on $\hat{\mathcal{H}}$, or the corresponding spectral measure $\hat{E}(\Delta)$, as operators representing the photon position, which allows us to satisfy the usual requirements at least formally. We deviate from ordinary quantum mechanics, however, to the extent that not the whole Hilbert space $\hat{\mathcal{H}}$ but only the subspace $\mathcal{H}$ of it is interpreted as the state space of the photon. Accordingly we interpret, for physical states $\mathbf{A} \in \mathcal{H}$ (and only for them),

$$\langle \mathbf{X}\rangle_{\mathbf{A}} = \langle \mathbf{A}, \hat{\mathbf{X}}\mathbf{A}\rangle \tag{44}$$

as expectation value of the photon position, and

$$w_{\mathbf{A}}(\Delta) = \langle \mathbf{A}, \hat{E}(\Delta)\mathbf{A}\rangle \tag{45}$$

as probability for finding the photon in the space region $\Delta$. These definitions satisfy the covariance requirements (23) and (28), as easily checked:

$$\langle \mathbf{X}\rangle_{U(\mathbf{a},R)\mathbf{A}} = \langle U(\mathbf{a},R)\mathbf{A}, \hat{\mathbf{X}}\,U(\mathbf{a},R)\mathbf{A}\rangle = \langle \hat{U}(\mathbf{a},R)\mathbf{A}, \hat{\mathbf{X}}\,\hat{U}(\mathbf{a},R)\mathbf{A}\rangle = \langle \mathbf{A}, (R\hat{\mathbf{X}} + \mathbf{a})\mathbf{A}\rangle = R\langle \mathbf{X}\rangle_{\mathbf{A}} + \mathbf{a}$$

by (41) and (42); similarly, (28) follows from (41) and the covariance of $\hat{E}(\Delta)$.
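Returning to the failure of commutativity noted above: the explicit form of the projected operator and the resulting commutator can be worked out in a few lines. The LaTeX sketch below is a reconstruction from the definition of $\hat{X}_j$ as $i\,\partial/\partial k_j$, the transversality identity, and the definition of $X_j$ as the transverse projection of $\hat{X}_j$; it assumes $\omega = |\mathbf{k}|$ and is offered as a check, not as the author's own computation.

```latex
% Assumptions: \omega = |k|; A transverse (k . A = 0);
% \Phi removes the longitudinal part (k/\omega)[(k/\omega) . B] of a vector B.
% Step 1: longitudinal part of i \partial_j A, using k . \partial_j A = -A_j:
\begin{align*}
\frac{\mathbf k}{\omega}\Bigl[\frac{\mathbf k}{\omega}\cdot
   i\,\frac{\partial \mathbf A}{\partial k_j}\Bigr]
 = -\,i\,\frac{\mathbf k}{\omega^2}\,A_j .
\end{align*}
% Step 2: subtracting it gives the projected operator
\begin{align*}
(X_j\mathbf A)(\mathbf k)
 = i\,\frac{\partial \mathbf A}{\partial k_j}
   + i\,\frac{\mathbf k}{\omega^2}\,A_j .
\end{align*}
% Step 3: apply X_j to B = X_l A and antisymmetrize in (j,l).
% The terms symmetric in (j,l) cancel, leaving (component m)
\begin{align*}
\bigl([X_j,X_l]\mathbf A\bigr)_m
 = \frac{\delta_{lm}A_j - \delta_{jm}A_l}{\omega^2}
   + \frac{k_m\,(k_jA_l - k_lA_j)}{\omega^4} .
\end{align*}
% Example: k = (0,0,\omega), A = (A_1,0,0), j=1, l=2, m=2
% gives ([X_1,X_2]A)_2 = A_1/\omega^2 \neq 0 .
```

Contracting the last expression with $k_m$ gives zero, so the commutator maps transverse functions to transverse functions, as it must for operators defined on the physical state space.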
The unphysical Hilbert space $\hat{\mathcal{H}}$ can be eliminated altogether from this description. By (38), equation (44) may also be written as

$$\langle \mathbf{X}\rangle_{\mathbf{A}} = \langle \mathbf{A}, \mathbf{X}\mathbf{A}\rangle \tag{46}$$

with $\mathbf{X}$ defined by (37), and (45) may be reformulated as

$$w_{\mathbf{A}}(\Delta) = \langle \mathbf{A}, F(\Delta)\mathbf{A}\rangle \tag{47}$$

with operators $F(\Delta)$ on $\mathcal{H}$ defined by

$$F(\Delta) = \Phi \hat{E}(\Delta)\big|_{\mathcal{H}} \tag{48}$$

From $0 \le w_{\mathbf{A}}(\Delta) \le 1$ for all normalized states $\mathbf{A}$ we get

$$F(\Delta)^* = F(\Delta), \qquad 0 \le F(\Delta) \le 1 \tag{49}$$

or in words: all $F(\Delta)$ are self-adjoint, non-negative and bounded in norm by one. They are in general not projection operators, except for particular regions $\Delta$ like $\emptyset$ and $\mathbb{R}^3$ (see below). This follows from a simple mathematical result:

For two projection operators $E_1$ and $E_2$, $E_1E_2E_1$ is a projection operator if and only if $E_1$ and $E_2$ commute.* (50)

Assume all $F(\Delta)$ to be projection operators, which by (48) means that all $\Phi\hat{E}(\Delta)\Phi$ are projection operators on $\hat{\mathcal{H}}$. Thus, by (50), $\Phi$ commutes with all $\hat{E}(\Delta)$, and therefore also with $\hat{X}_j = \int x_j\, d\hat{E}(\mathbf{x})$. This, however, is a contradiction since, by (36), there are $\mathbf{A} \in \mathcal{H}$ with $\hat{X}_j\mathbf{A} \notin \mathcal{H}$.

The spectral measure $\hat{E}(\Delta)$ satisfies relations of the form (26) and (27). Together with (48) this leads to similar relations for the operators $F(\Delta)$. We obtain from (26)

$$F(\emptyset) = 0, \qquad F(\mathbb{R}^3) = 1, \qquad F\Big(\bigcup_i \Delta_i\Big) = \sum_i F(\Delta_i) \quad \text{if } \Delta_i \cap \Delta_j = \emptyset \text{ for } i \neq j \tag{51}$$

and from (27) (or directly from (51))

$$F(\Delta_1 \cup \Delta_2) = F(\Delta_1) + F(\Delta_2) - F(\Delta_1 \cap \Delta_2), \qquad F(\Delta') = 1 - F(\Delta) \tag{52}$$

The first relation of (27), however, has no simple analogue for the operators $F(\Delta)$. We also cannot conclude from (51) that the operators $F(\Delta)$ commute with each other.* The physical interpretation of (51) and (52) in terms of the probabilities (47) is obvious.

Any correspondence $\Delta \to F(\Delta)$ of space regions $\Delta$ and operators $F(\Delta)$ with the properties (49) and (51) is called here, as usual, a POV (positive operator valued) measure on $\mathbb{R}^3$.

*Proof: Let $F = E_1E_2E_1$ be a projection operator. Then $F^2 = E_1E_2E_1E_2E_1 = F$. This implies, for $A = E_2E_1 - E_1E_2E_1$, that $A^*A = 0$, and thus $0 = A = A^* = A^* - A = [E_1, E_2]$. The converse is well known.
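A finite-dimensional toy case may help to see how compressing a projection to a subspace yields an operator between 0 and 1 that is not itself a projection. The example below is purely illustrative and not taken from the text: the two-dimensional 'enlarged' space, the vector $\varphi$ and the matrices are all assumptions chosen for simplicity, with $\hat{E}$ playing the role of a spectral projection and $\Phi$ the role of the projection onto the physical subspace.

```latex
% Toy model (hypothetical): enlarged space C^2, physical subspace spanned by
% \varphi = (1,1)/\sqrt{2}.
\begin{align*}
\hat E = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
\Phi = \varphi\varphi^{T}
     = \tfrac12\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
[\Phi,\hat E] = \tfrac12\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \neq 0 .
\end{align*}
% Compression to the physical subspace:
\begin{align*}
\Phi\hat E\Phi = (\varphi^{T}\hat E\varphi)\,\varphi\varphi^{T}
              = \tfrac12\,\Phi
\quad\Longrightarrow\quad
F = \tfrac12 \cdot 1 \ \text{on the subspace}, \qquad
F^2 = \tfrac14 \neq F .
\end{align*}
% Additivity survives the compression: with \hat E' = 1 - \hat E,
\begin{align*}
F' = \Phi\hat E'\Phi\big|_{\text{subspace}} = \tfrac12,
\qquad F + F' = 1 .
\end{align*}
```

Here $F$ satisfies $F^* = F$ and $0 \le F \le 1$, and the pair $F$, $F'$ adds up to the identity, yet neither is a projection, in line with the criterion that $E_1E_2E_1$ is a projection only when $E_1$ and $E_2$ commute.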
Our POV measure $F(\Delta)$ satisfies the additional condition

$$U(\mathbf{a},R)\,F(\Delta)\,U^*(\mathbf{a},R) = F(\Delta_{\mathbf{a},R}) \tag{53}$$

and is therefore called Euclidean covariant with respect to $U(\mathbf{a},R)$. Equation (53) follows immediately, since the condition (28) is satisfied for $w_{\mathbf{A}}(\Delta)$ as given by (47), with $\mathbf{A}$ arbitrary. A (covariant) spectral measure, clearly, is a particular case of a (covariant) POV measure, distinguished by the additional property that $(F(\Delta))^2 = F(\Delta)$ for all $\Delta$.

With this terminology, the formalism proposed here may be characterized by the fact that it describes the localization probabilities $w_{\mathbf{A}}(\Delta)$, via (47), in terms of a covariant POV measure $F(\Delta)$ instead of, as usual, a covariant spectral measure. The 'position operator' $\mathbf{X}$ introduced in addition is already determined by $F(\Delta)$. From $\hat{\mathbf{X}} = \int \mathbf{x}\, d\hat{E}(\mathbf{x})$ we obtain, for $\mathbf{A} \in \mathcal{H}$ and $dF(\mathbf{x}) = F(d^3x) = \Phi\, d\hat{E}(\mathbf{x})|_{\mathcal{H}}$,

$$\mathbf{X}\mathbf{A} = \Phi\hat{\mathbf{X}}\mathbf{A} = \Phi\int \mathbf{x}\, d\hat{E}(\mathbf{x})\mathbf{A} = \int \mathbf{x}\,\Phi\, d\hat{E}(\mathbf{x})\mathbf{A} = \int \mathbf{x}\, dF(\mathbf{x})\mathbf{A}$$

or, shortly,

$$\mathbf{X} = \int \mathbf{x}\, dF(\mathbf{x}) \tag{54}$$

and thus

$$\langle \mathbf{A}, \mathbf{X}\mathbf{A}\rangle = \int \mathbf{x}\, \langle \mathbf{A}, dF(\mathbf{x})\mathbf{A}\rangle \tag{55}$$

[For a physical interpretation of (55) compare the discussion of (30). As shown by (54), the POV measure $F(\Delta)$ provides a substitute for the non-existent common spectral representation of the three components $X_j$ of $\mathbf{X}$.]

The converse, however, is not true: There is no general procedure for reconstructing the POV measure $F(\Delta)$ from the operator $\mathbf{X}$ related to it by (54) (unless $F(\Delta)$ is known to be a spectral measure, which case was discussed above). This is due to the fact that a given operator $\mathbf{X}$ may have several representations of the form (54) with different POV measures $F(\Delta)$, as will be shown by means of an example in Section 6. Moreover, as compared to the case of an 'ordinary' position operator (i.e. one belonging to a spectral measure), the knowledge of $\mathbf{X}$ is also less useful here from a physical point of view.

*In fact they do not commute, for otherwise equation (54) below would imply commutativity of the components $X_j$ of $\mathbf{X}$.
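Whereas in both cases the expectation values of position in arbitrary states $\mathbf{A}$ may be calculated in terms of $\mathbf{X}$ as $\langle \mathbf{A}, \mathbf{X}\mathbf{A}\rangle$, the mean square deviations $\Delta_{\mathbf{A}}X_j$ are given by the familiar formula

$$(\Delta_{\mathbf{A}}X_j)^2 = \|(X_j - \langle X_j\rangle_{\mathbf{A}})\mathbf{A}\|^2 = \|X_j\mathbf{A}\|^2 - (\langle X_j\rangle_{\mathbf{A}})^2 \tag{56}$$

for an 'ordinary' position operator $\mathbf{X}$ only. For our position observable given by the POV measure $F(\Delta)$, the physical meaning of $\langle \mathbf{A}, dF(\mathbf{x})\mathbf{A}\rangle$ implies

$$(\Delta_{\mathbf{A}}X_j)^2 = \int (x_j - \bar{x}_j)^2\, \langle \mathbf{A}, dF(\mathbf{x})\mathbf{A}\rangle, \qquad \bar{x}_j = \langle X_j\rangle_{\mathbf{A}} \tag{57}$$

With (48) we get from this

$$(\Delta_{\mathbf{A}}X_j)^2 = \int (x_j - \bar{x}_j)^2\, \langle \mathbf{A}, d\hat{E}(\mathbf{x})\mathbf{A}\rangle = \|(\hat{X}_j - \langle X_j\rangle_{\mathbf{A}})\mathbf{A}\|^2 = \|\hat{X}_j\mathbf{A}\|^2 - (\langle X_j\rangle_{\mathbf{A}})^2 \tag{58}$$

which does not reduce to (56) since, in general, $\hat{X}_j\mathbf{A} \neq X_j\mathbf{A}$. A formal description of position measurements for a photon in terms of the 'position operator' $\mathbf{X}$ is thus incomplete, in contrast to the description in terms of the POV measure $F(\Delta)$. On the other hand, $F(\Delta)$ is fixed uniquely by the operator $\hat{\mathbf{X}}$ on the extended Hilbert space, since $\hat{\mathbf{X}}$ uniquely determines $\hat{E}(\Delta)$. As exemplified by (44) and (58), important physical quantities may also be calculated directly in terms of $\hat{\mathbf{X}}$.

From the point of view of quantum mechanics in its usual form, our description of position measurements in terms of a POV measure $F(\Delta)$ looks at least rather unconventional. Moreover, the explicit construction of $F(\Delta)$ described above is quite heuristic. Before looking for a better theoretical justification of the formalism, however, we will first derive some physical consequences from it. If these consequences look reasonable, this may perhaps help to strengthen the subsequent, more theoretical arguments in favour of our approach.

4. POSITION-MOMENTUM UNCERTAINTY RELATION AND TIME DEPENDENCE OF POSITION MEASUREMENTS

We start from Schwarz's inequality

$$\|\mathbf{A}'\|\cdot\|\mathbf{A}''\| \ge |\langle \mathbf{A}', \mathbf{A}''\rangle| \ge |\mathrm{Im}\,\langle \mathbf{A}', \mathbf{A}''\rangle| = \tfrac12\,|\langle \mathbf{A}', \mathbf{A}''\rangle - \langle \mathbf{A}'', \mathbf{A}'\rangle| \tag{59}$$

for arbitrary vectors $\mathbf{A}'$ and $\mathbf{A}''$ in $\hat{\mathcal{H}}$. From this we obtain, for a normalized state vector $\mathbf{A} \in \mathcal{H}$ in the domain of definition of both $\hat{X}_i$ and $\hat{P}_j$ (or, equivalently, of both $\hat{X}_i$ and $P_j$), the estimate

$$\|(\hat{X}_i - x_i)\mathbf{A}\|\cdot\|(P_j - p_j)\mathbf{A}\| \ge \tfrac12\,|\langle \hat{X}_i\mathbf{A}, P_j\mathbf{A}\rangle - \langle P_j\mathbf{A}, \hat{X}_i\mathbf{A}\rangle| \tag{60}$$

with $x_i = \langle X_i\rangle_{\mathbf{A}}$, $p_j = \langle P_j\rangle_{\mathbf{A}}$. (Take $\mathbf{A}' = (\hat{X}_i - x_i)\mathbf{A}$ and $\mathbf{A}'' = (\hat{P}_j - p_j)\mathbf{A} = (P_j - p_j)\mathbf{A}$, and note that the terms with $x_i$ and $p_j$ cancel in $\langle \mathbf{A}', \mathbf{A}''\rangle - \langle \mathbf{A}'', \mathbf{A}'\rangle$.) According to (58) and the analogue of (56) for the 'ordinary' observables $P_j$, the left-hand side of (60) is equal to $\Delta_{\mathbf{A}}X_i\cdot\Delta_{\mathbf{A}}P_j$.

A simple calculation, using the explicit definitions (35) of $\hat{\mathbf{X}}$ and (4) of $P_j$, yields

$$\langle \hat{X}_i\mathbf{B}, \hat{P}_j\mathbf{B}'\rangle - \langle \hat{P}_j\mathbf{B}, \hat{X}_i\mathbf{B}'\rangle = i\,\delta_{ij}\,\langle \mathbf{B}, \mathbf{B}'\rangle \tag{61}$$

for arbitrary vectors $\mathbf{B}$ and $\mathbf{B}'$ in $\hat{\mathcal{H}}$ belonging to the domain of definition of both $\hat{X}_i$ and $\hat{P}_j$. An alternative, more abstract proof of (61) uses the transformation property (42) of $\hat{\mathbf{X}}$ and the fact that $\hat{P}_j$ is the self-adjoint generator of translation along the $j$th axis, i.e.

$$e^{i\lambda\hat{P}_j}\,\hat{X}_i\,e^{-i\lambda\hat{P}_j} = \hat{X}_i + \lambda\,\delta_{ij} \tag{62}$$

as follows:

$$\langle \hat{X}_i\mathbf{B}, e^{i\lambda\hat{P}_j}\mathbf{B}'\rangle = \langle \mathbf{B}, \hat{X}_i\,e^{i\lambda\hat{P}_j}\mathbf{B}'\rangle = \langle \mathbf{B}, e^{i\lambda\hat{P}_j}(\hat{X}_i - \lambda\delta_{ij})\mathbf{B}'\rangle = \langle e^{-i\lambda\hat{P}_j}\mathbf{B}, \hat{X}_i\mathbf{B}'\rangle - \lambda\,\delta_{ij}\,\langle \mathbf{B}, e^{i\lambda\hat{P}_j}\mathbf{B}'\rangle$$

by (62), and thus

$$\langle \hat{X}_i\mathbf{B}, \hat{P}_j\mathbf{B}'\rangle - \langle \hat{P}_j\mathbf{B}, \hat{X}_i\mathbf{B}'\rangle = \frac{1}{i}\,\frac{d}{d\lambda}\Big(\langle \hat{X}_i\mathbf{B}, e^{i\lambda\hat{P}_j}\mathbf{B}'\rangle - \langle e^{-i\lambda\hat{P}_j}\mathbf{B}, \hat{X}_i\mathbf{B}'\rangle\Big)\Big|_{\lambda=0} = i\,\frac{d}{d\lambda}\Big(\lambda\,\delta_{ij}\,\langle \mathbf{B}, e^{i\lambda\hat{P}_j}\mathbf{B}'\rangle\Big)\Big|_{\lambda=0} = i\,\delta_{ij}\,\langle \mathbf{B}, \mathbf{B}'\rangle$$

From (60) and (61) we obtain Heisenberg's position-momentum uncertainty relation

$$\Delta_{\mathbf{A}}X_i\cdot\Delta_{\mathbf{A}}P_j \ge \tfrac12\,\delta_{ij} \tag{63}$$

for all normalized photon states $\mathbf{A}$ of the type specified above. As apparent from (58) for $\Delta_{\mathbf{A}}X_i$, and from the analogue of (56) for $\Delta_{\mathbf{A}}P_j$, such states are the only ones for which both $\Delta_{\mathbf{A}}X_i$ and $\Delta_{\mathbf{A}}P_j$ are finite. For all other states, therefore, (63) is satisfied in a trivial way.

Some readers might find our derivation of (63) rather pedantic. They might feel it would be easier to use the commutation relation $[\hat{X}_i, \hat{P}_j]$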
$= i\,\delta_{ij}$ for an evaluation of the right-hand side of (60) in the form

$$\langle \hat{X}_i\mathbf{A}, \hat{P}_j\mathbf{A}\rangle - \langle \hat{P}_j\mathbf{A}, \hat{X}_i\mathbf{A}\rangle = \langle \mathbf{A}, [\hat{X}_i, \hat{P}_j]\mathbf{A}\rangle = i\,\delta_{ij}$$

This short-cut calculation, clearly, is not perfectly rigorous. That it may even lead to wrong physical conclusions is explained elsewhere (Kraus, 1970).

As mentioned before (see footnote at beginning of Section 3), the POV measure $F(\Delta)$ describes position measurements at time $t = 0$ in a given inertial frame. Therefore the operators $U(a,\Lambda)F(\Delta)U^*(a,\Lambda)$ describe position measurements at time $t' = 0$ in a 'primed' inertial frame, generated from the original 'unprimed' one by the Poincaré transformation $(a,\Lambda)$. Of particular interest is the case where $(a,\Lambda)$ is a pure time translation by the amount $t$, which leads to the POV measure

$$F_t(\Delta) = U(t)\,F(\Delta)\,U^*(t), \qquad U(t) = U(\{\mathbf{0},t\},1) = e^{iHt} \tag{64}$$

As the corresponding measurements, obviously, may also be interpreted as position measurements at time $t$ in the original inertial frame, equation (64) is nothing but the familiar time dependence of the observables $F_t(\Delta)$ in the Heisenberg picture.

The same transformation $F(\Delta) \to F_t(\Delta)$ is also obtained from

$$\hat{E}_t(\Delta) = \hat{U}(t)\,\hat{E}(\Delta)\,\hat{U}^*(t), \qquad F_t(\Delta) = \Phi\hat{E}_t(\Delta)\big|_{\mathcal{H}} \tag{65}$$

with a unitary operator $\hat{U}(t)$ on $\hat{\mathcal{H}}$ satisfying

$$[\hat{U}(t), \Phi] = 0, \qquad \hat{U}(t)\big|_{\mathcal{H}} = U(t) \tag{66}$$

In fact, (65) and (66) imply

$$F_t(\Delta) = \Phi\hat{E}_t(\Delta)\big|_{\mathcal{H}} = \Phi\hat{U}(t)\hat{E}(\Delta)\hat{U}^*(t)\big|_{\mathcal{H}} = U(t)\,\Phi\hat{E}(\Delta)\big|_{\mathcal{H}}\,U^*(t) = U(t)\,F(\Delta)\,U^*(t)$$

in accordance with (64). The condition (66) is satisfied, for example, by

$$\hat{U}(t) = e^{i\hat{H}t} \tag{67}$$

with the 'natural' extension $\hat{H}$ of the Hamiltonian $H$ described in Section 2. Of course, (66) has many other solutions besides (67), but this particular one is very convenient since it permits the explicit calculation of the self-adjoint operator

$$\hat{\mathbf{X}}_t = \hat{U}(t)\,\hat{\mathbf{X}}\,\hat{U}^*(t) \tag{68}$$

which corresponds to the spectral measure $\hat{E}_t(\Delta)$. Indeed, a simple calculation using the explicit expressions $\hat{\mathbf{X}} = i\nabla_{\mathbf{k}}$, $\hat{U}(t) = e^{i\omega t}$ yields

$$\hat{\mathbf{X}}_t = \hat{\mathbf{X}} + \hat{\mathbf{V}}t \tag{69}$$

with the (vector) multiplication operator

$$\hat{\mathbf{V}} = \frac{\hat{\mathbf{P}}}{\hat{H}} = \frac{\mathbf{k}}{\omega} \tag{70}$$

on $\hat{\mathcal{H}}$.
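The 'simple calculation' behind the linear time dependence of the position operator on the enlarged space can be written out as follows. This LaTeX sketch assumes, as in the text, that the time translation acts in momentum space as multiplication by $e^{i\omega t}$ with $\omega = |\mathbf{k}|$, and that the position operator is $i\nabla_{\mathbf{k}}$; the test function $\mathbf{A}(\mathbf{k})$ is arbitrary.

```latex
% Assumptions: \hat U(t) = e^{i\omega t} (multiplication operator),
% \omega = |k|, \hat X = i \nabla_k.
\begin{align*}
\bigl(\hat U(t)\,\hat{\mathbf X}\,\hat U^*(t)\mathbf A\bigr)(\mathbf k)
 &= e^{i\omega t}\; i\nabla_{\mathbf k}
    \bigl[e^{-i\omega t}\mathbf A(\mathbf k)\bigr] \\
 &= e^{i\omega t}\Bigl[\,i\,(-it\,\nabla_{\mathbf k}\omega)\,
      e^{-i\omega t}\mathbf A(\mathbf k)
    + e^{-i\omega t}\, i\nabla_{\mathbf k}\mathbf A(\mathbf k)\Bigr] \\
 &= i\nabla_{\mathbf k}\mathbf A(\mathbf k)
    + t\,(\nabla_{\mathbf k}\omega)\,\mathbf A(\mathbf k) .
\end{align*}
% With \nabla_k \omega = k/\omega this is
\begin{align*}
\hat{\mathbf X}_t = \hat{\mathbf X} + \hat{\mathbf V}\,t ,
\qquad \hat{\mathbf V} = \frac{\mathbf k}{\omega} .
\end{align*}
```

The velocity operator thus appears as the gradient of the dispersion relation $\omega(\mathbf{k})$, which for $\omega = |\mathbf{k}|$ is a unit vector.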
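Implicitly equation (69) contains the complete solution of the Heisenberg equation of motion (64) for $F_t(\Delta)$, since $\hat{\mathbf{X}}_t$ uniquely determines its spectral measure $\hat{E}_t(\Delta)$ which in turn, via (65), yields $F_t(\Delta)$. This does not mean that (69) is really helpful for the explicit calculation of the $F_t(\Delta)$. However, in most cases one is satisfied with a much less detailed description of position measurements at different times, and in such cases (69) may be applied directly.

Consider, for instance, the time-dependent expectation value $\langle \mathbf{X}_t\rangle_{\mathbf{A}}$ of position in a given state $\mathbf{A}$. We obtain

$$\langle \mathbf{X}_t\rangle_{\mathbf{A}} = \langle \mathbf{A}, \hat{\mathbf{X}}_t\mathbf{A}\rangle = \langle \mathbf{A}, \hat{\mathbf{X}}\mathbf{A}\rangle + \langle \mathbf{A}, \hat{\mathbf{V}}\mathbf{A}\rangle t = \langle \mathbf{A}, \mathbf{X}_t\mathbf{A}\rangle = \langle \mathbf{A}, \mathbf{X}\mathbf{A}\rangle + \langle \mathbf{A}, \mathbf{V}\mathbf{A}\rangle t \tag{71}$$

with

$$\mathbf{X}_t = \Phi\hat{\mathbf{X}}_t\big|_{\mathcal{H}} = \mathbf{X} + \mathbf{V}t \tag{72}$$

and

$$\mathbf{V} = \frac{\mathbf{P}}{H} = \frac{\mathbf{k}}{\omega} \tag{73}$$

a multiplication operator on $\mathcal{H}$ which, obviously, has to be interpreted as the photon velocity operator. Its components $V_j$ satisfy the relations

$$-1 \le V_j \le 1, \qquad |\mathbf{V}| = (V_1^2 + V_2^2 + V_3^2)^{1/2} = 1 \tag{74}$$

as to be expected from this interpretation.* As a $\mathbf{k}$ space average with weight function $\overline{\mathbf{A}(\mathbf{k})}\cdot\mathbf{A}(\mathbf{k})$ of the unit vectors $\mathbf{k}/\omega$, the vector $\langle \mathbf{V}\rangle_{\mathbf{A}} = \langle \mathbf{A}, \mathbf{V}\mathbf{A}\rangle$ (with components $\langle V_j\rangle_{\mathbf{A}} = \langle \mathbf{A}, V_j\mathbf{A}\rangle$) has a length $|\langle \mathbf{V}\rangle_{\mathbf{A}}|$ smaller than one.† Thus (71) implies

$$\frac{d}{dt}\langle \mathbf{X}_t\rangle_{\mathbf{A}} = \langle \mathbf{V}\rangle_{\mathbf{A}}, \qquad |\langle \mathbf{V}\rangle_{\mathbf{A}}| < 1 \tag{75}$$

i.e. the time-dependent average photon positions $\langle \mathbf{X}_t\rangle_{\mathbf{A}}$ lie on a straight time-like worldline.

*Note that in our units the light velocity is equal to one.

†The value one is excluded since, for normalized $\mathbf{A}$, the weight function $\overline{\mathbf{A}(\mathbf{k})}\cdot\mathbf{A}(\mathbf{k})$ cannot be a delta function.

For the time dependence of the mean square deviation of the $j$th photon coordinate, an obvious generalization of (58) together with (69) yields

$$\Delta_{\mathbf{A}}X_{jt} = \|(\hat{X}_{jt} - x_{jt})\mathbf{A}\| = \|(\hat{X}_j - x_j)\mathbf{A} + (\hat{V}_j - v_j)t\,\mathbf{A}\|$$

with

$$x_j = \langle X_j\rangle_{\mathbf{A}}, \qquad v_j = \langle V_j\rangle_{\mathbf{A}}, \qquad x_{jt} = \langle X_{jt}\rangle_{\mathbf{A}} = x_j + v_j t$$

and thus

$$\Delta_{\mathbf{A}}X_{jt} \le \|(\hat{X}_j - x_j)\mathbf{A}\| + |t|\,\|(\hat{V}_j - v_j)\mathbf{A}\|$$

But $\|(\hat{X}_j - x_j)\mathbf{A}\| = \Delta_{\mathbf{A}}X_j$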
whereas

$$\|(\hat{V}_j - v_j)\mathbf{A}\|^2 = \|(V_j - v_j)\mathbf{A}\|^2 = (\Delta_{\mathbf{A}}V_j)^2 = \|V_j\mathbf{A}\|^2 - v_j^2 \le \|V_j\mathbf{A}\|^2 < 1$$

in virtue of (74).* Therefore we obtain the estimates

$$\Delta_{\mathbf{A}}X_{jt} \le \Delta_{\mathbf{A}}X_j + \Delta_{\mathbf{A}}V_j\,|t|, \qquad \Delta_{\mathbf{A}}V_j < 1 \tag{76}$$

which show that the growth in time of the mean square deviations of photon coordinates (or, translated into the Schrödinger picture, of the widths of a wave packet in the directions of the three coordinate axes) is also restricted by the velocity of light. By using rotational invariance [or a simple generalization of (58), compare equation (79)], a similar estimate may be derived for the width $\Delta_{\mathbf{A}}(\mathbf{e}\cdot\mathbf{X}_t)$ of the wave packet $\mathbf{A}$ in the direction of an arbitrary unit vector $\mathbf{e}$.

The relations (75) and (76) express a certain kind of causal behaviour of position measurements. If, for instance, there were wave packets $\mathbf{A}$ with average positions $\langle \mathbf{X}_t\rangle_{\mathbf{A}}$ moving faster than light, this would hardly be consistent with relativistic causality, since one could easily imagine the use of such wave packets as faster-than-light signals. Likewise, the existence of wave packets with average widths $\Delta_{\mathbf{A}}X_{jt}$ growing with superluminal velocity would look suspect from the point of view of causality, although it is not at all obvious how such wave packets could be used to exchange signals between space-like separated observers.

Another causality requirement for successive position measurements would be the following: If, in some state $\mathbf{A}$, the photon position at time $t = 0$ is certainly inside a region $\Delta$, then at any time $t$ the photon has to be with certainty in the region

$$\Delta_t = \{\mathbf{x} \mid |\mathbf{x} - \mathbf{y}| \le |t| \text{ for some } \mathbf{y} \in \Delta\} \tag{77}$$

i.e.

$$\langle \mathbf{A}, F(\Delta)\mathbf{A}\rangle = 1 \quad \text{implies} \quad \langle \mathbf{A}, F_t(\Delta_t)\mathbf{A}\rangle = 1 \text{ for all } t \tag{78}$$

As the corresponding property for classical particles is obvious, the postulate (78) seems to be well-founded, too. However, (78) is simply wrong, and is moreover wrong not only for photons and the POV measure $F(\Delta)$ considered here but, quite generally, for all relativistic elementary particles and for all conceivable position observables. More precisely, it turns out that $\langle \mathbf{A}, F_t(\Delta_t)\mathbf{A}\rangle$
More precisely, it turns out that (A, F,(A,)A> ♦Namely, (74) implies j| V,|| = 1, and thus ||V,A||=s||A|| = 1. 1| V,A|P = (A, VJA) = 1 would mean that Aisaneigenstateof Vf of eigenvalue one, but the spectrum of V] = k j /a> z is purely continuous. Kraus 311 is strictly less than one for all states A with (A, F(A)A) = 1 and all times t¥^ 0. In this generality, the result is due to Hegerfeldt (1974); we refer to his paper for the (surprisingly simple) proof. Of course, the requirement (78) is non- trivial only if there are at least some regions A and corresponding states A with (A, F(A)A) = 1. In our present case, however, it is easily seen that such states A indeed exist for an arbitrarily given region A.* On the other hand, it is clear from (75) that only the 'tail' of the wave packet can be outside of A, at time t. Indeed, its centre (X r ) A is certainly in A, for all t since, at least for convex regions A, it is in A for t = 0. A similar intuitive picture of the spreading of wave packets is suggested by the following estimate. With x = (X) A and an arbitrary unit vector e, we define by (5(e, t)f = | (e • (x-x)) 2 <A, dF,(x)A> a measure 5(e, t) for the average spatial distance, in the direction of e, of the wave packet at time t from its centre x at time t = 0. Since S(e,/) = ||(e-(X,-x))A|| (compare the derivation of (58)), we find from (69): fi(e,Os||(e-Cfc-i))A|| + M||(e.V)A|| Now A A (e • X) = ||(e • (X-x))A|| - S(e, 0) (79) clearly, is the width of the wave packet, measured along the direction of e, at time t = 0, whereas by (74). Thus, finally, (e - V) A|| = ||(e • V) A|| s ||A|| = 1 5(e,r)<A A (e-X) + M (80> The physical meaning of this estimate is obvious. Estimates analogous to (75), (76) and (80) may also be derived, e.g., for the Newton-Wigner position operator of a particle with mass m > 0. [In this case the velocity operator is P/// = P(P 2 +m 2 r 5 .] 
We feel that the violation of (78) should not be taken to indicate an 'acausality' until it is proved that its violation indeed makes possible, at least in principle, the exchange of faster-than-light signals. It seems not implausible at least that the impossibility of such signals already follows from estimates like (75) or (80). This problem, clearly, should be investigated further. In any case, however, the violation of (78) is a nice example of how misleading the classical particle picture may be in quantum mechanics.

*Since $\hat{E}(\Delta)$ acts as multiplication by $\chi_\Delta(\mathbf{x})$ on the Fourier transform $\mathbf{A}(\mathbf{x})$ of $\mathbf{A}(\mathbf{k})$, we have $\hat{E}(\Delta)\mathbf{A} = \mathbf{A}$, and thus $\langle \mathbf{A}, F(\Delta)\mathbf{A}\rangle = 1$, if $\mathbf{A}(\mathbf{x}) = 0$ outside of $\Delta$.

5. EFFECTS AND GENERALIZED OBSERVABLES

The simplest measurements are those for which only two different outcomes 'yes' and 'no', or one and zero, are possible. In usual quantum mechanics these 'yes-no' observables are represented by projection operators $E$ on the state space $\mathcal{H}$, such that $\langle \mathbf{A}, E\mathbf{A}\rangle$ is the probability for the outcome one (i.e., 'yes') if $E$ is measured in the normalized state $\mathbf{A} \in \mathcal{H}$. If applied to particle detectors, this formalism immediately leads to the description of particle position by a covariant spectral measure $E(\Delta)$.

However, if quantum mechanical yes-no measurements are investigated more closely [e.g. by considering suitable models (Kraus, 1971 and 1974)], one realizes that they do not in general correspond to projection operators. Instead, a general yes-no measurement has to be described by an operator $F$ on $\mathcal{H}$ with

$$F^* = F, \qquad 0 \le F \le 1 \tag{81}$$

the probability for the outcome 'yes' in state $\mathbf{A}$ being given again by $\langle \mathbf{A}, F\mathbf{A}\rangle$. The projection operators $E$ are a very particular class of such operators $F$, and in fact any practically performable yes-no measurement most likely does not correspond to a projection operator. This observation may serve as the starting point for Ludwig's axiomatic reformulation of quantum theory (Ludwig, 1970).
General yes-no experiments are called 'effects' in this theory, whereas the particular ones corresponding to projection operators are denoted as 'decision effects'.

The corresponding generalization of the notion of an observable is almost obvious. Usually a quantum-mechanical observable is taken to correspond to a self-adjoint operator $X$ on $\mathcal{H}$, and the spectral measure $E(\Delta)$ on the real line obtained from the spectral representation

$$X = \int x\, dE(x) \tag{82}$$

is interpreted as follows: For a given interval $\Delta$, $E(\Delta)$ corresponds to the yes-no observable which takes the value one (respectively zero) if for the original observable a value $x$ in (respectively outside of) $\Delta$ is measured. The generalization consists of admitting observables for which a general POV measure $F(\Delta)$ on the real line takes the rôle of $E(\Delta)$, with the same physical interpretation.* From this point of view, then, the use of a covariant POV measure $F(\Delta)$ on $\mathbb{R}^3$ for the description of photon detectors looks quite natural.†

*In particular, we do not interpret POV measures as describing inaccurate ('fuzzy') measurements, as done by Ali and Emch (1974).

†The operators $F(\Delta)$ are interpreted here as describing 'exact' photon positions, i.e. a click in the 'counter' corresponding to $F(\Delta)$ is taken to indicate that the photon is really inside $\Delta$. We feel that one could speak meaningfully of inaccurate position measurements only if there were another, 'more exact' position observable to compare with.

The properties of such generalized observables have been illustrated in Section 4 by the example of the photon coordinates $X_j$. In particular, we have seen that the associated operator

$$X = \int x\, dF(x) \tag{83}$$

does not completely describe a generalized observable.* One further point, however, is worth mentioning here. By their very definition, any two effects $F(\Delta_1)$ and $F(\Delta_2)$ belonging to the POV measure of a generalized observable can be measured together (e.g.
simply by determining the value of the observable with sufficient precision). In Ludwig's terminology, such effects are called 'coexistent'. If one is dealing with a spectral measure $E(\Delta)$, the well-known necessary and sufficient condition for the 'coexistence' (usually called 'commensurability' in this case) of $E(\Delta_1)$ and $E(\Delta_2)$ is commutativity, which is indeed satisfied for any spectral measure (cf. Section 3). More generally, two effects $F_1$ and $F_2$ are coexistent if and only if there exist three effects $F_1'$, $F_2'$ and $F_3$ such that

$$F_1 = F_1' + F_3, \qquad F_2 = F_2' + F_3, \qquad F_1' + F_2' + F_3 \le 1 \tag{84}$$

(See Ludwig (1970), or Kraus (1974) for a more elementary discussion.) Commutativity of $F_1$ and $F_2$ is sufficient to guarantee the validity of (84), but is necessary only if $F_1$ or $F_2$ or both are projection operators. For $F_1 = F(\Delta_1)$ and $F_2 = F(\Delta_2)$ belonging to a POV measure, (84) is satisfied in virtue of the measure property (51), which implies

$$F(\Delta_1) = F(\Delta_1 \cap \Delta_2') + F(\Delta_1 \cap \Delta_2), \qquad F(\Delta_2) = F(\Delta_2 \cap \Delta_1') + F(\Delta_1 \cap \Delta_2)$$

and

$$F(\Delta_1 \cap \Delta_2') + F(\Delta_2 \cap \Delta_1') + F(\Delta_1 \cap \Delta_2) = F\big((\Delta_1 \cap \Delta_2') \cup (\Delta_2 \cap \Delta_1') \cup (\Delta_1 \cap \Delta_2)\big) = F(\Delta_1 \cup \Delta_2) \le 1$$

The attempt of Jauch and Piron (1967) and Amrein (1969) at constructing a photon position observable is closely related to the one discussed here. These authors, however, insist on the description of yes-no observables by projection operators, and therefore do not accept $F(\Delta)$ as describing a photon counter in the space region $\Delta$. They take instead, for this purpose, the projection operator $E'(\Delta)$ onto the subspace of eigenvectors of $F(\Delta)$ belonging to the eigenvalue one. An equivalent definition is $E'(\Delta) = \Phi \cap \hat{E}(\Delta)|_{\mathcal{H}}$, with the projection operator $\Phi \cap \hat{E}(\Delta)$ onto the intersection of the subspaces $\mathcal{H} = \Phi\hat{\mathcal{H}}$ and $\hat{E}(\Delta)\hat{\mathcal{H}}$ of $\hat{\mathcal{H}}$. The covariance condition (29) and the first two relations of (26) are easily seen to be satisfied for the operators $E'(\Delta)$.
The third (additivity) condition of (26) has thus to be violated, since otherwise $E'(\Delta)$ would be a covariant spectral measure. Because this additivity condition has a direct physical interpretation, we consider its violation as a serious disadvantage. Moreover, there are certain pairs of regions for which the corresponding operators $E'(\Delta)$ do not commute, and thus do not describe commensurable measurements.

*There are even POV measures $F(\Delta)$ for which (83) makes sense only if applied to the zero vector, so that there is no operator $X$ at all. However, such 'observables' are pathological also from the physical point of view.

6. THE UNIQUENESS PROBLEM

At first sight the method used in Section 3 for constructing the Euclidean covariant POV measure $F(\Delta)$ for the photon might look somewhat fortuitous. This is not the case, however, as the following discussion shows. We start with

Theorem 1:

(1). Consider a Hilbert space $\mathcal{H}$ with a POV measure $F(\Delta)$ on $\mathbb{R}^3$. Then there exists an extended Hilbert space $\hat{\mathcal{H}} \supseteq \mathcal{H}$ with a spectral measure $\hat{E}(\Delta)$ on $\mathbb{R}^3$, such that

$$F(\Delta) = \Phi\hat{E}(\Delta)\big|_{\mathcal{H}}$$

$\Phi$ being the projection operator on $\hat{\mathcal{H}}$ with range $\mathcal{H}$.

(2). Let $F(\Delta)$ be covariant with respect to a continuous unitary representation $U(\mathbf{a},R)$ of the Euclidean group on $\mathcal{H}$. Then there exists a continuous unitary representation $\hat{U}(\mathbf{a},R)$ on $\hat{\mathcal{H}}$ which extends $U(\mathbf{a},R)$, i.e.

$$[\hat{U}(\mathbf{a},R), \Phi] = 0, \qquad \hat{U}(\mathbf{a},R)\big|_{\mathcal{H}} = U(\mathbf{a},R)$$

such that $\hat{E}(\Delta)$ is covariant with respect to $\hat{U}(\mathbf{a},R)$.*

(3). An extension $\hat{\mathcal{H}}$ of $\mathcal{H}$ as described under (1) is called minimal if $\hat{\mathcal{H}}$ is spanned by vectors of the form $\hat{E}(\Delta)\mathbf{A}$, with arbitrary regions $\Delta$ and arbitrary vectors $\mathbf{A} \in \mathcal{H}$. The space $\hat{\mathcal{H}}$ and the spectral measure $\hat{E}(\Delta)$ (and, for covariant $F(\Delta)$, also the representation $\hat{U}(\mathbf{a},R)$) of a minimal extension are unique up to unitary equivalence. A non-minimal extension $\hat{\mathcal{H}}$ contains a subspace which reduces the spectral measure $\hat{E}(\Delta)$ (and the representation $\hat{U}(\mathbf{a},R)$, if $F(\Delta)$ is covariant), and which is a minimal extension of $\mathcal{H}$.
This Theorem is also true for POV measures on arbitrary spaces (instead of $\mathbb{R}^3$) and for more general covariance groups. In this general form, parts (1) and (3) of the Theorem are due to Neumark (1943) [see also Riesz and Nagy (1956)] whereas part (2) has been proved recently by Neumann (1972).

If one wants to construct a covariant POV measure $F(\Delta)$ on a Hilbert space $\mathcal{H}$ with a given representation $U(\mathbf{a},R)$ of the Euclidean group, Theorem 1 suggests the following procedure: First, look for a suitable extension $\hat{U}(\mathbf{a},R)$ of $U(\mathbf{a},R)$ to a larger Hilbert space $\hat{\mathcal{H}}$, such that on $\hat{\mathcal{H}}$ there exists a spectral measure $\hat{E}(\Delta)$ which is covariant with respect to $\hat{U}(\mathbf{a},R)$; then, take $F(\Delta) = \Phi\hat{E}(\Delta)|_{\mathcal{H}}$. It is easily shown that this $F(\Delta)$ is indeed a covariant POV measure, whereas Theorem 1 guarantees that every covariant POV measure may be constructed in this way.

*If one is dealing with particles of half-integer spin, then $U$ and $\hat{U}$ are representations not of the Euclidean group itself but of its covering group [cf., for example, Wightman (1962)].

A further advantage of this construction is the fact that all representations $\hat{U}(\mathbf{a},R)$ of the Euclidean group which admit a covariant spectral measure $\hat{E}(\Delta)$ are explicitly known up to unitary equivalence:

Theorem 2: Consider a Hilbert space $\hat{\mathcal{H}}$ with a continuous unitary representation $\hat{U}(\mathbf{a},R)$ of the Euclidean group and a covariant spectral measure $\hat{E}(\Delta)$ on $\mathbb{R}^3$. Then there is a unitary transformation which brings $\hat{\mathcal{H}}$, $\hat{U}(\mathbf{a},R)$ and $\hat{E}(\Delta)$ to the following standard form:

(1). $\hat{\mathcal{H}}$ consists of all complex 'vector' functions $\mathbf{f}(\mathbf{k})$ of a real three-vector $\mathbf{k}$, with 'vector' components $f_\alpha(\mathbf{k})$, $\alpha \in I$ (some finite or infinite index set), which are square-integrable in the sense that

$$\int d^3k \sum_\alpha |f_\alpha(\mathbf{k})|^2 < \infty$$

The inner product in $\hat{\mathcal{H}}$ is

$$\langle \mathbf{f}, \mathbf{f}'\rangle = \int d^3k \sum_\alpha \overline{f_\alpha(\mathbf{k})}\, f'_\alpha(\mathbf{k})$$

(2). $\hat{U}(\mathbf{a},R)$ is given by

$$(\hat{U}(\mathbf{a},R)\mathbf{f})_\alpha(\mathbf{k}) = e^{-i\mathbf{k}\cdot\mathbf{a}} \sum_{\beta \in I} D_{\alpha\beta}(R)\, f_\beta(R^{-1}\mathbf{k}) \tag{85}$$

with a continuous unitary representation of the rotation group by matrices $D(R)$ with matrix elements $D_{\alpha\beta}(R)$, $\alpha, \beta \in I$.

(3).
$\hat{E}(\Delta)$ is the spectral measure of the self-adjoint 'position' operator $\hat{\mathbf{X}} = \int \mathbf{x}\, d\hat{E}(\mathbf{x})$ whose components $\hat{X}_j$ are defined by*

$$(\hat{X}_j \mathbf{f})_\alpha(\mathbf{k}) = i\,\frac{\partial}{\partial k_j} f_\alpha(\mathbf{k})$$

This Theorem plays the crucial role in the paper of Wightman (1962), where a detailed proof is given.

Representations $\hat{U}(\mathbf{a},R)$ of the Euclidean group of the form (85) are highly reducible. First of all, the unitary representation $D(R)$ of the rotation group may be decomposed in the usual way into irreducible representations $D^S(R)$ with fixed angular momentum quantum number ('spin') $S$,† of dimension $2S + 1$, which decomposes $\hat{U}(\mathbf{a},R)$ into subrepresentations $\hat{U}^S(\mathbf{a},R)$. For each subrepresentation $\hat{U}^S(\mathbf{a},R)$, the representation space may be further decomposed into $2S + 1$ subspaces with definite helicities $-S, -S+1, \ldots, S-1, S$, which further reduce $\hat{U}^S(\mathbf{a},R)$ since helicity is Euclidean invariant.* (For details see, for example, Amrein (1969).) On the other hand, the representation $U(\mathbf{a},R)$ on the photon state space $\mathcal{H}$ may be decomposed into subrepresentations with helicities $+1$ and $-1$ (cf. Section 2). Thus Theorem 2 forbids the existence of a covariant spectral measure on $\mathcal{H}$ which would require, at least, the presence also of helicity zero states.

The simplest way of extending $U(\mathbf{a},R)$ to a representation $\hat{U}(\mathbf{a},R)$ of the type (85) is, therefore, to add just this 'missing' subrepresentation of helicity zero. This leads to a representation $\hat{U}(\mathbf{a},R)$ of the form (85), with $D(R) = R$ irreducible and belonging to $S = 1$, and exactly this has been done in Section 3. In the light of Theorems 1 and 2, therefore, the construction of $F(\Delta)$ in Section 3 appears quite natural.

*In the 'position' representation, i.e. in terms of the Fourier transforms $\mathbf{f}(\mathbf{x})$ of $\mathbf{f}(\mathbf{k})$, the operators $\hat{X}_j$ and $\hat{E}(\Delta)$ act as multiplication by $x_j$ and $\chi_\Delta(\mathbf{x})$, respectively.

†In our case each $S$ is integer, whereas half-integer $S$ occur in the case mentioned in the footnote to Theorem 1.
However, it is obvious now that this construction is only the simplest but not a unique one. There are very many different possibilities of embedding $U(a, R)$ into representations of the form (or unitarily equivalent to) (85), which in general lead to different POV measures. We will illustrate this non-uniqueness by two simple examples.

For instance, we can embed $U(a, R)$ by adding, besides the missing helicity zero states, two other subrepresentations of helicities $+2$ and $-2$, so that we obtain a representation $\hat{U}'(a, R)$ with $D'(R)$ belonging to $S = 2$. (The primes serve to distinguish the present construction from the one considered in Section 3.) This representation may be realized concretely in the Hilbert space $\hat{\mathcal{H}}'$ of complex symmetric traceless second-rank tensors $g_{ij}(\mathbf{k})$ ($i, j = 1, 2, 3$) with the inner product
$$(g, g') = \int d^3k\, \overline{g_{ij}(\mathbf{k})}\, g'_{ij}(\mathbf{k}) \tag{86}$$
(sum convention) and the tensor transformation law
$$(\hat{U}'(a, R)g)_{ij}(\mathbf{k}) = e^{-i\mathbf{k}\cdot\mathbf{a}} R_{ir} R_{js}\, g_{rs}(R^{-1}\mathbf{k}) \tag{87}$$
The photon state space $\mathcal{H}$ of transverse vector functions $\mathbf{A}(\mathbf{k})$ may be embedded isometrically in $\hat{\mathcal{H}}'$ by identifying a given $\mathbf{A} \in \mathcal{H}$ with the tensor
$$g_{ij}(\mathbf{k}) = \frac{1}{\sqrt{2}} \left( \frac{k_i}{k} A_j(\mathbf{k}) + \frac{k_j}{k} A_i(\mathbf{k}) \right) \tag{88}$$
in $\hat{\mathcal{H}}'$, where $k = |\mathbf{k}|$. As easily checked, the transformation law (87) for tensors $g_{ij}$ of the particular form (88) is equivalent to the vector transformation law (39) for $\mathbf{A}$, so that, by the embedding (88), $U(a, R)$ becomes a subrepresentation of $\hat{U}'(a, R)$. The projection operator $\Phi'$ on $\hat{\mathcal{H}}'$ with range $\mathcal{H}$ transforms a given tensor $g_{ij} \in \hat{\mathcal{H}}'$ into a vector $\mathbf{A} \in \mathcal{H}$ with components
$$A_i(\mathbf{k}) = \sqrt{2} \left( \frac{k_j}{k}\, g_{ij}(\mathbf{k}) - \frac{k_i k_j k_l}{k^3}\, g_{jl}(\mathbf{k}) \right) \tag{89}$$
Since this extension $\hat{\mathcal{H}}'$ of $\mathcal{H}$ is of the form required in Theorem 2, the self-adjoint operator $\hat{\mathbf{X}}' = i\nabla_{\mathbf{k}}$ with covariant spectral measure $\hat{E}'(\Delta)$ exists on $\hat{\mathcal{H}}'$, and leads to a covariant POV measure and a position operator
$$F'(\Delta) = \Phi'\hat{E}'(\Delta)|_{\mathcal{H}} \tag{90}$$
$$\mathbf{X}' = \Phi'\hat{\mathbf{X}}'|_{\mathcal{H}} \tag{91}$$
on the photon state space $\mathcal{H}$.

*The subrepresentations with definite helicities are still highly reducible, since the absolute value of the momentum $\mathbf{P} = \mathbf{k}$ is also Euclidean invariant.
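The algebra behind the embedding can be checked pointwise in $\mathbf{k}$. The sketch below is illustrative only and assumes the following readings of the two maps: the embedding $g_{ij} = (k_i A_j + k_j A_i)/(\sqrt{2}\,|\mathbf{k}|)$ for (88) and the projection $A_i = \sqrt{2}\,(k_j g_{ij}/|\mathbf{k}| - k_i k_j k_l g_{jl}/|\mathbf{k}|^3)$ for (89). The function names `embed` and `project` are hypothetical. It verifies that the embedded tensor is symmetric and traceless, that the embedding preserves the pointwise norm, and that the projection undoes the embedding on transverse vectors.

```python
import math

def embed(k, A):
    """Tensor g_ij = (k_i A_j + k_j A_i) / (sqrt(2) |k|) built from a transverse A."""
    kn = math.sqrt(sum(c * c for c in k))
    return [[(k[i] * A[j] + k[j] * A[i]) / (math.sqrt(2) * kn) for j in range(3)]
            for i in range(3)]

def project(k, g):
    """Vector A_i = sqrt(2) (k_j g_ij / |k| - k_i k_j k_l g_jl / |k|^3)."""
    kn = math.sqrt(sum(c * c for c in k))
    kg = [sum(k[j] * g[i][j] for j in range(3)) / kn for i in range(3)]
    kgk = sum(k[i] * kg[i] for i in range(3)) / kn
    return [math.sqrt(2) * (kg[i] - k[i] * kgk / kn) for i in range(3)]

def vector_norm2(A):
    return sum(abs(a) ** 2 for a in A)

def tensor_norm2(g):
    return sum(abs(g[i][j]) ** 2 for i in range(3) for j in range(3))

k = (1.0, 2.0, 2.0)                      # sample momentum, |k| = 3
A = (2 + 1j, -2 + 0.5j, 1 - 1j)          # chosen so that k . A = 0 (transverse)

g = embed(k, A)
trace = sum(g[i][i] for i in range(3))
A_back = project(k, g)

print("trace of g:", abs(trace))
print("|A|^2 =", vector_norm2(A), " |g|^2 =", tensor_norm2(g))
print("max |A_back - A| =", max(abs(x - y) for x, y in zip(A_back, A)))
```

Applied to an arbitrary symmetric tensor, `project` always returns a vector orthogonal to `k`, as a map onto the transverse photon space must.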
A straightforward calculation with (88), (89) and (91) leads to
$$X'_i \mathbf{A} = \left( i\,\frac{\partial}{\partial k_i} \mathbf{A} \right)^{T}$$
(the transverse part of $i\,\partial\mathbf{A}/\partial k_i$) for an arbitrary $\mathbf{A} \in \mathcal{H}$. This coincides with $X_i\mathbf{A}$ as given by (43), and thus $\mathbf{X}' = \mathbf{X}$. On the other hand, the POV measure $F'(\Delta)$ is different from $F(\Delta)$ as constructed in Section 3 (see below). Since, therefore,
$$\mathbf{X} = \int \mathbf{x}\, dF(\mathbf{x}) = \int \mathbf{x}\, dF'(\mathbf{x})$$
but
$$F(\Delta) \neq F'(\Delta)$$
we have here an example for the non-uniqueness of the POV measure corresponding to a given operator $\mathbf{X}$.

As an example for a covariant POV measure $F''(\Delta)$ for which the corresponding position operator $\mathbf{X}''$ is different from $\mathbf{X}$, consider
$$F''(\Delta) = E_+ F(\Delta) E_+ + E_- F(\Delta) E_- \tag{92}$$
with $F(\Delta)$ as in Section 3 and the projection operators $E_\pm$ onto the subspaces $\mathcal{H}_\pm$ of $\mathcal{H}$ belonging to the helicities $\pm 1$. Obviously (92) defines a POV measure, whose covariance follows from $[E_\pm, U(a, R)] = 0$. The corresponding position operator is
$$\mathbf{X}'' = \int \mathbf{x}\, dF''(\mathbf{x}) = E_+ \mathbf{X} E_+ + E_- \mathbf{X} E_- \tag{93}$$
and is different from $\mathbf{X}$ since $[\mathbf{X}, E_\pm] \neq 0$ (as easily checked by direct calculation) whereas, clearly, $[\mathbf{X}'', E_\pm] = 0$.

Both examples may be easily generalized. Embedding of $U(a, R)$ into $\hat{U}_S(a, R)$ with $S = 1, 2, 3, \ldots$, as described above for $S = 1$ and 2, leads to an infinite sequence of covariant POV measures $F_S(\Delta)$, $S = 1, 2, 3, \ldots$, which are all different. We can simply show this as follows. It is known (and follows easily from Theorem 2) that, for any given $S$, all unitary operators $\hat{U}_S(a, R)$ together with all spectral projections $\hat{E}_S(\Delta)$ form an irreducible set of operators on the representation space $\hat{\mathcal{H}}_S$.
Statement (3) of Theorem 1 then implies that the extensions $\hat{\mathcal{H}}_S$ of $\mathcal{H}$ are minimal and, consequently, that $F_S(\Delta) \neq F_{S'}(\Delta)$ if $S \neq S'$.*

Instead of (92) we may consider, more generally,
$$F'''(\Delta) = \sum_{i,S} A^*_{iS}\, F_S(\Delta)\, A_{iS} \tag{94}$$
with $F_S(\Delta)$ as above and (finitely or infinitely many) operators $A_{iS}$ satisfying
$$[A_{iS}, U(a, R)] = 0, \qquad \sum_{i,S} A^*_{iS} A_{iS} = 1 \tag{95}$$
It may be proved from (95) that (94) indeed defines a covariant POV measure.† Since the representation $U(a, R)$ is highly reducible, there are very many different sets of operators $A_{iS}$ which satisfy (95) and which, in general, will also lead to different POV measures $F'''(\Delta)$. By Theorem 1, each $F'''(\Delta)$ may also be obtained from a suitable extension of $\mathcal{H}$ and $U(a, R)$. However, except for particularly simple cases like (92), such an extension is expected to look rather complicated.

We have seen that the covariant POV measure $F(\Delta) = F_1(\Delta)$ constructed in Section 3 is very far from being unique. On the contrary, the diversity of possible candidates for a photon position observable might appear really bewildering. Moreover, any covariant POV measure leads to Heisenberg's position-momentum uncertainty relation (63) and to the photon velocity operator $\mathbf{V}$ given by (73), and is thus acceptable as a photon position observable also from this point of view. This follows from Theorem 1 by a straightforward generalization of the reasoning applied in Section 4.

One could try to reduce this non-uniqueness, as done by Wightman (1962) for the particular case of spectral measures, by exploiting suitable additional postulates like time reversal invariance and 'smoothness' in momentum space. With such additional assumptions Wightman was able to prove uniqueness of the Newton-Wigner position observables.
However, one cannot hope to obtain uniqueness by this method in the case of general POV measures unless Wightman's additional postulates are sharpened considerably because, for example, all $F_S(\Delta)$ and many $F'''(\Delta)$ of the form (94) are both time reversal invariant and 'smooth' in momentum space.

Therefore we do not believe that the 'true' photon position observable, i.e. the one which describes real position measurements, can be determined by purely kinematic considerations. In this respect we fully agree with Wightman (1962), who wrote: 'All investigations of localizability for relativistic particles up to now . . . construct position observables consistent with a given transformation law. It remains to construct complete dynamical theories . . . and then to investigate whether the position observables are indeed observable with the apparatus that the dynamical theories themselves predict.'

*Presumably, however, the corresponding position operators $\mathbf{X}_S$ are all equal. (At least $\mathbf{X}_1 = \mathbf{X}_2$, see above.)

†This is trivial if one is dealing with finitely many operators $A_{iS}$.

At present the only candidate for a 'complete dynamical theory' of elementary particles is quantum field theory, and photon localization experiments should thus be investigated in the framework of quantum electrodynamics if one wants to go beyond pure kinematics. Such an investigation is also expected to allow a more profound treatment of the causality problem mentioned at the end of Section 4.

Since quantum electrodynamics describes photons as 'quanta of a vector field', we are tempted to speculate that the 'vector' POV measure $F_1(\Delta)$ of Section 3 might be distinguished from this point of view. An additional argument in favour of $F_1(\Delta)$ is simplicity. It is therefore not unreasonable to consider $F_1(\Delta)$, in spite of its non-uniqueness, as describing actually realizable photon detectors.

The generalization of the present discussion to other elementary particles is almost obvious.
For a massive particle, for instance, one finds that there exist infinitely many generalized position observables besides the usual Newton-Wigner position operator. The latter, however, is distinguished by the fact that it is the only 'ordinary' position observable.* For this reason, the non-uniqueness problem appears not to be so serious in this case. Like the photon, the neutrino does not possess an 'ordinary' position observable (Wightman, 1962), whereas it is very easy to construct, via Theorems 1 and 2, generalized position observables. Again one of them is distinguished by simplicity. It is an additional advantage of the present approach as compared to the one of Amrein, Jauch and Piron that the latter does not provide a theoretical description of neutrino position measurements (Amrein, 1969).

*Note that the 'decision effects' form a distinct class of 'effects' in Ludwig's theory (Ludwig, 1970). This implies a corresponding distinction of 'ordinary' (so-called 'decision') observables.

Acknowledgment

I would like to thank Georg Reents and Michael Everitt for critical readings of the manuscript.

REFERENCES

Ali, S. T. and Emch, G. G. (1974) 'Fuzzy observables in quantum mechanics', J. Math. Phys., 15, 176-182.
Amrein, W. O. (1969) 'Localizability for particles of mass zero', Helv. Phys. Acta, 42, 149-190.
Hegerfeldt, G. C. (1974) 'Remark on causality and particle localization', Phys. Rev., D10, 3320-3321.
Heisenberg, W. (1927) 'Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik', Z. Physik, 43, 172-198.
Jauch, J. M. and Piron, C. (1967) 'Generalized localizability', Helv. Phys. Acta, 40, 559-570.
Kraus, K. (1970) 'Note on azimuthal angle and angular momentum in quantum mechanics', Amer. J. Phys., 38, 1489-1490.
Kraus, K. (1971) 'General state changes in quantum theory', Ann. Phys. (N.Y.), 64, 311-335.
Kraus, K.
(1974) 'Operations and effects in the Hilbert space formulation of quantum theory', Lecture Notes in Physics (Springer-Verlag), 29, 206-229.
Ludwig, G. (1970) 'Deutung des Begriffs "physikalische Theorie" und axiomatische Grundlegung der Hilbertraumstruktur der Quantenmechanik durch Hauptsätze des Messens', Lecture Notes in Physics.
Neumann, H. (1971) 'Classical systems and observables in quantum mechanics', Commun. Math. Phys.
Neumann, H. (1972) 'Transformation properties of observables', Helv. Phys. Acta, 45, 811-819.
Neumark, M. A. (1943) 'On a representation of additive operator set functions', Doklady Acad. Sci. URSS.
Newton, T. D. and Wigner, E. P. (1949) 'Localized states for elementary systems', Revs. Mod. Phys.
Riesz, F. and Sz.-Nagy, B. (1956) Vorlesungen über Funktionalanalysis, Anhang. Berlin: VEB Deutscher Verlag der Wissenschaften.
Wightman, A. S. (1962) 'On the localizability of quantum mechanical systems', Revs. Mod. Phys.
Wigner, E. (1939) 'Unitary representations of the inhomogeneous Lorentz group', Ann. Math., 40, 149-204.

A New Theoretical and Experimental Outlook on Magnetic Monopoles

ERASMO RECAMI
Università di Catania, Italy
and
ROBERTO MIGNANI
Università dell'Aquila, Italy

Since experiments looking for magnetic monopoles have failed until now, and new experiments are going on, it should be interesting to know, and to take into account, the predictions of special relativity alone on the subject. We are going to show that special relativity by itself:

(1). Does not explicitly predict the existence of (slower-than-light) magnetic monopoles;

(2). Does explicitly predict, on the contrary, the existence of tachyonic (i.e. faster-than-light) 'monopoles';

(3). Predicts their unit magnetic charge to be about one hundred times smaller than that usually assumed (Dirac, 1931, 1948; Schwinger, 1966) ($g = \pm e$, in Gaussian units);

(4).
Many good features of the old hypothesis about magnetic monopoles (Dirac, 1931, 1948; Schwinger, 1966) are reproduced by simply taking account of Superluminal ($v^2 > c^2$) speeds. In particular, the existence of both subluminal ($v^2 < c^2$) and Superluminal 'electric' charges leads to fully symmetrical Maxwell equations (Mignani and Recami, 1974b), cf. equation (1) in the following, and possibly to the Schwinger-type relation $eg = n\alpha\hbar$.

In fact, let us build anew the theory of special relativity without assuming a priori $|v| < c$ (Recami and Mignani, 1974a). In other words, let us start from the postulates:

(1). Principle of relativity: the laws of mechanics and of electromagnetism are covariant under a transition between two inertial frames, whose relative speed $u$ is a priori $-\infty < u < +\infty$.

(2). Space is isotropic and space-time homogeneous.

Moreover, negative-energy particles do not exist and, for every observer, physical signals are transported only by positive-energy objects. The usefulness of the last sentence, even in standard Relativity, has been shown by us, e.g. in Recami and Mignani (1974a).

There follows an 'extended relativity' (Recami and Mignani, 1974a and references therein), in which light speed is invariant with respect to all inertial frames, both subluminal ($s$) and Superluminal ($S$), and in which tachyons do not imply (see, e.g., Recami, 1973; Pavsic, Recami and Ziino, 1976) any causality violation. What is more, the 'extended relativity' proved to be useful even for standard particle physics, since for example it allowed the derivation of the 'crossing relations' for the relativistic reactions (Mignani and Recami, 1974e), and the CPT theorem (Mignani and Recami, 1974d). It leads, incidentally, to suitable redefinitions of the discrete symmetries.

The point we want to stress here is the following.
If we consider the existence of electric charges both subluminal [with four-current $j^\mu(s) = (\rho(s), \mathbf{j}(s))$] and Superluminal [whose four-current is $j^\mu(S) = (\rho(S), \mathbf{j}(S))$], then the (generalized) Maxwell equations read (Mignani and Recami, 1974b), for $v^2 \gtrless c^2$:
$$\begin{aligned} \operatorname{div}\mathbf{D} &= +\rho(s) \\ \operatorname{div}\mathbf{B} &= -\rho(S) \\ \operatorname{rot}\mathbf{E} &= -\partial\mathbf{B}/\partial t + \mathbf{j}(S) \\ \operatorname{rot}\mathbf{H} &= +\partial\mathbf{D}/\partial t + \mathbf{j}(s) \end{aligned} \qquad [v^2 \gtrless c^2;\ s \leftrightarrow |v| < c;\ S \leftrightarrow |v| > c] \tag{1}$$
That is to say, faster-than-light electric charges are predicted by (extended) relativity to behave in a similar way as magnetic monopoles were supposed to do (apart from the different speed); cf. also Figure 1 in Mignani and Recami (1974b). More precisely, a Superluminal positive electric charge (e.g. with speed $V > c$, along the $x$ axis) will bring into the field equations a contribution similar to that which was supposed to come from a magnetic south pole (with $v = c^2/V$, along $x$) (Recami and Mignani, 1974a). Thus, 'tachyonic electrons' will appear with north magnetic charge ($+g$), and 'tachyonic protons' with south magnetic charge ($-g$); and so on.

Therefore, a Superluminal unit electric charge $e$ will appear to us as a (tachyonic) 'monopole' with possibly the unit magnetic charge
$$g = -e \quad \text{(in Gaussian units)} \tag{2}$$
so that in general we expect to have (when quantizing)
$$eg = n\alpha\hbar \tag{3}$$
where $\alpha$ is the fine-structure constant. It follows that relativity seems to predict a magnetic strength unit about 100 times smaller than that usually assumed.

In other words, extended relativity predicts only one charge (let us call it 'electromagnetic charge'), which behaves, if you like, as 'electric' when subluminal ($v^2 < c^2$) and as 'magnetic' when Superluminal ($v^2 > c^2$). Cf. again Figure 1 in Mignani and Recami (1974b). Also, Maxwell's equations may be written in a fully symmetrical form [cf. equation (1)], without assuming (subluminal) monopole existence. What is more, the universality of electromagnetic interactions is recovered in extended relativity, since $|g| = |e|$, i.e.
only one coupling constant essentially exists in our framework even before quantizing the theory.

When passing to quantum mechanics, on the contrary, we can say the following. If one assumes the existence of subluminal magnetic monopoles, then the simultaneous quantization of both electric and magnetic charges follows. This might suggest that even subluminal magnetic monopoles could exist, with their large unit charge. Notice, however, that the previous argument would be the only one in favour of subluminal magnetic charges, since, for example, in the present theory Maxwell equations already have a fully symmetrical form (moreover, that argument would become even weaker if we actually succeed, when quantizing our theory, in deriving a relation like equation (3), which would also yield a charge quantization).

Here, let us mention only the following, in order to support our equation (2):

(i) The Dirac relation $eg = n\hbar/2$ (or the analogous one by Schwinger) does come in the theory only when magnetic monopoles are supposed to be subluminal;

(ii) On the contrary, if magnetic monopoles are considered to be Superluminal, then 'extended relativity' seems to yield the alternative relation $g = ne$.

In fact, let us eventually quantize our theory by using Mandelstam's method, i.e. by following Cabibbo and Ferrari (1962). In that approach, the field quantities describing the charges (in interaction with the electromagnetic field) are defined so that
$$\phi(x, P') = \phi(x, P) \cdot \exp\left[ -\frac{ie}{2} \int_S F_{\mu\nu}\, d\sigma^{\mu\nu} \right] \tag{4}$$
where $S$ is a surface delimited by the two (space-like) paths $P$, $P'$ considered, ending at point $x$. In other words, the field quantities $\phi$ are independent of the gauge chosen for the four-potential $A_\mu$ but are path-dependent. When only (subluminal) electric charges are present, then $F_{\mu\nu} = A_{\nu/\mu} - A_{\mu/\nu}$ and equation (4) does not depend on the selected surface $S$ (but depends merely on its boundary $P - P'$).
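The size of the difference between relations (i) and (ii) can be made concrete with a few lines of arithmetic. This is only an illustrative sketch, assuming Gaussian units with $\hbar = c = 1$, so that $e^2 = \alpha$; the numerical value of $\alpha$ is the standard CODATA figure, and the function name is hypothetical. For $n = 1$ the Dirac and Schwinger relations give unit magnetic charges of $e/2\alpha$ and $e/\alpha$ respectively, against $|g| = |e|$ for the tachyonic 'monopole', which is the order-of-magnitude statement 'about 100 times' made in the text.

```python
ALPHA = 7.2973525693e-3   # fine-structure constant (dimensionless, CODATA value)

def unit_magnetic_charge_over_e(relation):
    """Unit magnetic charge g (n = 1) in units of e.

    Assumes Gaussian units with hbar = c = 1, so that e**2 = ALPHA.
    """
    if relation == "dirac":        # e g = n / 2
        return 1.0 / (2.0 * ALPHA)
    if relation == "schwinger":    # e g = n
        return 1.0 / ALPHA
    if relation == "tachyonic":    # |g| = |e|
        return 1.0
    raise ValueError("unknown relation: " + relation)

for name in ("dirac", "schwinger", "tachyonic"):
    print(name, round(unit_magnetic_charge_over_e(name), 2))
```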
However, if subluminal magnetic monopoles are present too, then $F_{\mu\nu} = A_{\nu/\mu} - A_{\mu/\nu} - i\varepsilon_{\mu\nu\rho\sigma}B_{\rho/\sigma}$ (where $B_\mu$ is a second four-potential), and the following condition must be explicitly imposed:
$$\exp\left[ -\frac{ie}{2} \oint_S F_{\mu\nu}\, d\sigma^{\mu\nu} \right] = 1$$
wherefrom Dirac's relation $eg = n\hbar/2$ follows.

At this point, it is immediate to realize that, if 'magnetic monopoles' cannot be put at rest, as in the case of tachyon monopoles, then equation (4) is again automatically satisfied, without any recourse to Dirac's condition.

CONCLUSIONS

According to (extended) relativity, all the experimental searches for magnetic monopoles should be done, or redone, by actually looking for 'tachyon monopoles'; i.e. taking into account the newly proposed kinematics (faster-than-light speeds) and the possibly much lower value of the apparent magnetic strength. In particular, the 'tachyon monopoles' will probably suffer in an electromagnetic field the 'Lorentz force' $\mathbf{F} = g\mathbf{H} - g\mathbf{V} \wedge \mathbf{E}$, where however $V^2 > c^2$.

Actually, Bartlett and Lahana (1972) have already tried to look for 'tachyon monopoles', but in vain, because the basis of their theoretical assumptions, namely Cherenkov radiation supposedly emitted by tachyons in vacuum, is incorrect, as has been shown by us (Mignani and Recami, 1974a).

More details can be found in Recami and Mignani (1976) and in the proceedings (to appear) of the interdisciplinary seminars on 'Tachyons and Related Topics' delivered at Erice (September, 1976).

ACKNOWLEDGEMENT

The authors are grateful to Dr. S. Chissick and to Dr. E. Papp for their kind interest.

REFERENCES

Bartlett, D. F. and Lahana, M. D. (1972) Phys. Rev., D6, 1817.
Cabibbo, N. and Ferrari, E. (1962) Nuovo Cimento, 23, 1147.
Dirac, P. A. M. (1931) Proc. Roy. Soc., A133, 60.
Dirac, P. A. M. (1948) Phys. Rev., 74, 817.
Mignani, R. and Recami, E. (1974a) Lett. Nuovo Cimento, 9, 362.
Mignani, R. and Recami, E. (1974b) Lett. Nuovo Cimento, 9, 367.
Mignani, R. and Recami, E. (1974c) Lett.
Nuovo Cimento, 11, 417.
Mignani, R. and Recami, E. (1974d) Lett. Nuovo Cimento, 11, 421.
Mignani, R. and Recami, E. (1974e) Nuovo Cimento, A24, 438.
Recami, E. and Mignani, R. (1976) Physics Letters, 62B, 41.
Recami, E. (1973) Annuario 73, Enciclopedia EST-Mondadori (Milano), p. 85.
Recami, E. and Mignani, R. (1974a) Rivista Nuovo Cimento, 4, 209-290; [Erratum] 4, 398.
Recami, E. and Mignani, R. (1974b) Lett. Nuovo Cimento, 9, 479.
Pavsic, M., Recami, E. and Ziino, G. (1976) Lett. Nuovo Cimento, in press.
Schwinger, J. (1966) Phys. Rev., 144, 1087.

Problems in Conformally Covariant Quantum Field Theory

W. RÜHL and B. C. YUNN
Universität Kaiserslautern, Germany

1. INTRODUCTION

The conformal group appeared in physics as early as 1909, when Cunningham (1909) and Bateman (1910) first noticed that Maxwell's equations are not only Lorentz covariant but also covariant under the larger conformal group. This consists of the usual Lorentz transformations, translations, dilations and special conformal transformations. Since then many attempts have been made to utilize this group in physics (Kastrup, 1962, 1964; Wess, 1960; Fulton and coworkers, 1962; Mack and Salam, 1969).

We are particularly interested in the possibility of constructing a local quantum field theory which is also conformally covariant. One particular feature of such a field theory is that it possesses global operator product expansions of the Wilson type (Wilson, 1969). This may have far-reaching consequences in more realistic field theories involving non-zero masses as well.

The requirement of the conformal symmetry is so strong that the most general two- and three-point functions are determined completely up to arbitrary normalization constants. Therefore, for example, their analytic structures can be studied unambiguously.

So far there are two non-perturbative approaches to analysing a conformally covariant quantum field theory.
One is the so-called bootstrap approach, which tries to construct general $n$-point functions from the skeleton graph expansion using the conformally covariant two- and three-point functions. This was initiated by Migdal (1971), Mack and Todorov (1973) and Polyakov (1969). In this approach one can indeed prove that every term in the expansions is ultraviolet convergent if one restricts the anomalous dimensions of the fields to a certain range. Thus the construction is term by term conformally covariant. The dimensions and coupling constants, however, are not free parameters; instead they are determined from self-consistency conditions which arise from integral equations for the two- and three-point functions. The main drawback of this bootstrap approach lies in the inability to handle the infinite series appearing in the expansions, just as in conventional perturbation theory.

Another approach, adopted by Mack (1974) to avoid this difficulty, starts by writing down an infinite number of coupled integral equations for Euclidean Green's functions and solving them by making use of conformal partial wave expansions. A remarkable observation is that in this way one can diagonalize the whole set of integral equations and thereby reduce them to a set of algebraic equations for the partial wave amplitudes. A careful analysis shows that these waves must possess some poles, and the factorization property of their residues also follows when one considers them as analytic functions of the representation parameters. This in turn enables one to derive asymptotic operator product expansions with a certain additional assumption. This program, intensively pursued in recent years by Dobrev and coworkers (1975a, 1975b), also has some intrinsic difficulties. In particular, imposing crossing symmetry on the partial wave amplitudes is difficult to carry out.
In any case it remains to be seen whether the latter difficulty is easier to handle than the infinite summation problem in the bootstrap approach.

Formulating the conformally covariant quantum field theory directly in Minkowski space presents additional difficulties connected with the fact that the conformal group is larger than the group of causal automorphisms of Minkowski space. The difficulties already manifest themselves in free field theories. Until recently it was considered that one was either forced to move to Euclidean space or to restrict oneself only to infinitesimal transformations, thus inventing a terminology like 'weak conformal invariance' (Hortacsu and coworkers, 1972). Detailed studies (Schroer and Swieca, 1974; Kupsch and coworkers, 1975) of the free fields and of some explicitly soluble interacting field theories in two-dimensional space-time have now made it possible to understand the structure involved. It is generally accepted that the universal covering group of the conformal group is essential. The fields are subjected to a non-local Fourier decomposition on the centre of this group, which generalizes the decomposition of the free field into a creation and an annihilation part. Operator product expansions in Minkowski space have to be studied not in terms of fields but in terms of these non-local projections. This makes the whole scheme very complicated. An interesting question is whether one can somehow recombine all these components into a local expression, and also the question of the convergence of the operator product expansion is well worth pursuing further.

Our plan is as follows. In Section 2 some difficulties associated with the fact that the conformal group does not preserve the causal structure of the Minkowski space $M_D$ ($D$ = dimension) are discussed.
The universal covering of a compactified Minkowski space (denoted $\tilde{M}_D^c$), with a causal structure that is locally isomorphic to that of $M_D$ and is invariant under the universal covering group of the conformal group, is introduced, and the possibility of defining a field theory on $\tilde{M}_D^c$ is also discussed. The transformation properties of quantized fields are investigated in Section 3. A significant role is played by the generating element $Z$ of the centre in formulating the non-local decomposition of the field operators. The Thirring model is introduced as an explicit example of a conformally covariant field theory. In the subsequent Section 4 this model is used in our study of operator product expansions in Minkowski space. In two-dimensional space-time the conformal group and its covering are small enough to carry out the necessary computations explicitly, and this makes it easy to show the local structure of operators appearing in the expansions. In Section 5 we try to develop some general model-independent ideas on the operator product expansions.

2. CAUSAL STRUCTURES

Causality as a geometric concept is a partial ordering in Minkowski space, or more generally of a manifold. We are interested mainly in the Minkowski space $M_4$, but for the sake of constructing models other Minkowski spaces $M_D$ with the characteristic form
$$x_0^2 - x_1^2 - \cdots - x_{D-1}^2 \tag{1}$$
are also of importance. Automorphisms of the manifold, in particular of $M_D$, that together with their inverses preserve the causal ordering are called causal automorphisms. Zeeman's theorem (Zeeman, 1964) asserts that the group of causal automorphisms of $M_4$ (relative time-like pairs of vectors are ordered into an 'earlier' and a 'later' vector, relative space-like pairs are not ordered) consists of orthochronous Lorentz transformations, translations and dilations. This group we call the 'Weyl group'. The conformal group possesses the Weyl group as a proper subgroup and thus violates the causal ordering.
Zeeman's theorem can easily be generalized to Minkowski spaces with $D > 2$. It has for a long time been interpreted as forbidding any extension of the space-time symmetry beyond the Weyl group, in particular excluding any internal symmetry combined with space-time symmetry.

The conformal group consists of products of inhomogeneous Lorentz transformations, of dilations
$$x'^\mu = \lambda x^\mu, \qquad \lambda > 0 \tag{2}$$
and of special conformal transformations
$$x'^\mu = \sigma(b, x)^{-1}\,(x^\mu + b^\mu x^2) \tag{3}$$
$$\sigma(b, x) = 1 + 2b_\mu x^\mu + b^2 x^2 \tag{4}$$
It is obvious that $M_D$ is not a homogeneous space for these transformations, since whatever we choose for $b$ in (3), there are vectors $x \in M_D$ for which $\sigma(b, x) = 0$. Compactifying $M_D$ evades this problem but leads to a manifold that trivially does not possess a causal ordering which extends the causal ordering of $M_D$, since the time axis is now closed at infinity.

A useful parametrization of $M_D$ and of $M_D^c$, the compactification of $M_D$, is defined as follows (Rühl, 1975). Introduce the Hermitian $2 \times 2$ matrix
$$X = x^0\sigma_0 + \mathbf{x}\cdot\boldsymbol{\sigma}, \qquad \sigma_0 = 1 \tag{5}$$
and
$$U = (\sigma_0 + iX)(\sigma_0 - iX)^{-1} \tag{6}$$
(for $M_3$ ($M_2$) define $x^1$ ($x^1$ and $x^2$) to be zero). Then the compactification $M_D^c$ is obtained from $M_D$ by adjoining all $U$ with $\det(\sigma_0 + U) = 0$. We write
$$U = e^{i\varphi/2}u, \qquad \det u = 1, \qquad 0 \le \varphi < 2\pi \tag{7}$$
As can be seen from (7) and the parametrization
$$u = u^0\sigma_0 + i\mathbf{u}\cdot\boldsymbol{\sigma}, \qquad (u^0)^2 + \mathbf{u}\cdot\mathbf{u} = 1 \tag{8}$$
$M_D^c$ has the topological structure
$$S_1 \times S_{D-1} \tag{9}$$
This is also true for $D > 4$.

Despite these difficulties with causality, physical models are known that are quantum field theories in the proper sense and exhibit conformal covariance with an invariant vacuum, namely the free massless operator fields. Beyond these free theories a few models with interactions in $M_2$ are known to be conformally symmetric. In the framework of quantum field theory the free field theories are limiting cases, as comes out in the following fashion (Rühl, 1973).
We assume that a conformally covariant quantum field theory in the sense of Wightman is given, and that the vacuum is invariant. We consider the state
$$\Phi_A(x)|0\rangle \tag{10}$$
where $\Phi_A(x)$ is any spinor field operator. Due to the spectrum condition this state can be analytically continued in $x$ into the tube domain, which is a homogeneous space for the conformal group and its universal covering group. If the conformal group acts on it, it transforms as an analytic representation, i.e. a representation of the discrete series (Rühl, 1973; Mack, 1975). Such representations are labelled by three parameters $j_1$, $j_2$, $d$, where $j_1$ ($j_2$) is the undotted (dotted) spin as for a spinor representation of $SL(2, C)$, and $d$ is the 'dimension' of the field. Since one wants $d$ to assume not only integral or half-integral values, one has to study the universal covering group of the conformal group. This group is denoted $G_D$ in the following.

The invariant two-point function
$$\langle 0|\Phi_A(x)\,\Phi_B(y)^\dagger|0\rangle = |\gamma|^2\, \Delta_{AB}(x - y) \tag{11}$$
is fixed by group theory alone, up to a positive normalization constant, to be an intertwining operator for the discrete series representations of $G_D$. It is a homogeneous distribution in $x - y$ of degree $-2d$. The requirement that this distribution is positive is equivalent to the requirement that the discrete series representation involved admits an invariant norm in a Hilbert space, i.e. it is unitary. This entails that the dimension $d$ is bounded from below, namely
$$d \ge j_1 + j_2 + 2 \quad \text{if } j_1 j_2 \neq 0 \tag{12a}$$
$$d \ge j_1 + j_2 + 1 \quad \text{if } j_1 j_2 = 0 \tag{12b}$$
At the lower bound (12b) of $d$ degenerate representations appear that, as we shall see, belong to free fields. In fact, from (11) we deduce the vacuum expectation value of the causal commutator (or anticommutator). The usual connection between spin and statistics can be verified. If $d$ assumes the lower bound value (12b), the commutator (anticommutator) function assumes the canonical form for a free massless field.
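The bounds (12a) and (12b) are easy to tabulate. The sketch below is illustrative only: it assumes the reading 'greater than or equal' of the two inequalities for $D = 4$, uses exact rational arithmetic so that half-integral spins cause no rounding, and the function names are hypothetical. For the familiar free massless fields the bound (12b) reproduces the canonical dimensions.

```python
from fractions import Fraction

def min_dimension(j1, j2):
    """Lower bound on the dimension d for spins (j1, j2), read off (12a)/(12b).

    Assumes the bounds are attained (d >= ...), as suggested by the remark
    that degenerate representations appear at the lower bound (12b).
    """
    j1, j2 = Fraction(j1), Fraction(j2)
    return j1 + j2 + (2 if j1 * j2 != 0 else 1)

def at_free_field_bound(j1, j2, d):
    """True if d sits at the bound (12b), i.e. j1 * j2 = 0 and d = j1 + j2 + 1."""
    j1, j2 = Fraction(j1), Fraction(j2)
    return j1 * j2 == 0 and Fraction(d) == j1 + j2 + 1

for label, j1, j2 in [("scalar", 0, 0),
                      ("Weyl spinor", "1/2", 0),
                      ("self-dual field strength", 1, 0),
                      ("vector current", "1/2", "1/2")]:
    print(label, "d >=", min_dimension(j1, j2))
```

The first three entries give $d = 1$, $3/2$ and $2$, the canonical dimensions of the free massless scalar, Weyl spinor and Maxwell field strength; the fourth entry falls under (12a) and gives $d \ge 3$.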
Due to the theorem of Jost, Schroer and Pohlmeyer (Jost, 1961; Pohlmeyer, 1969), the field itself is a free massless field in this case.

Another conclusion can be drawn from (12b) and the homogeneity of the two-point function. A conformally covariant quantum field involves asymptotic states carrying particles if and only if it is a free massless field. This reduces the value of such field theories considerably. We can either regard them as models of academic interest only, some of whose properties can hopefully be carried over to more general quantum field theories with particles, or they appear at best as limiting theories in the Gell-Mann-Low sense of realistic quantum field theories (Gell-Mann and Low, 1954).

For free fields causality does not cause any problem. Their commutator (anticommutator) is a number-valued distribution and this is conformally covariant. All $n$-point functions can be expressed by these two-point functions and are automatically covariant.

For a deeper inspection we make the ansatz

[equation (13) is illegible in the source apart from the index labels $A'$, $B'$; it expresses the transformed field at $x$ through the field at $x_g$ times a multiplier $\mu(g, x)$] (13)

i.e. a 'local transformation' under the conformal group. It involves a singular multiplier $\mu(g, x)$ as soon as special conformal transformations (3), (4) participate in the group element $g$. The multiplier involves a negative power of $\sigma(b, x)$ (for the exact expression see Section 3) and thus is singular whenever $x_g$ (3) is singular. In the case that we project both sides of (13) on the vacuum from the right, the singular multiplier is a boundary value of an antiholomorphic function in the forward tube domain, and if we project it on the vacuum from the left, we obtain a boundary value of a holomorphic function on the forward tube domain. It follows that the singular multiplier in (13) cannot be given a unique meaning, since the multipliers are different in either case.
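The singularity of the special conformal transformations (3), (4), and the way they can exchange space-like and time-like vectors, can be seen in a small exact computation. The following is a minimal sketch under stated assumptions: the metric signature is taken as $(+, -, \ldots, -)$, the example is in $M_2$, and the function names are hypothetical. It uses the identity $x'^2 = x^2/\sigma(b, x)$, which follows from $(x + b\,x^2)^2 = x^2\,\sigma(b, x)$.

```python
from fractions import Fraction as Fr

def minkowski_dot(x, y):
    """Minkowski product with signature (+, -, ..., -) (an assumed convention)."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))

def sigma(b, x):
    """sigma(b, x) = 1 + 2 b.x + b^2 x^2, as in (4)."""
    return 1 + 2 * minkowski_dot(b, x) + minkowski_dot(b, b) * minkowski_dot(x, x)

def special_conformal(b, x):
    """x' = (x + b x^2) / sigma(b, x), as in (3); undefined where sigma vanishes."""
    s = sigma(b, x)
    if s == 0:
        raise ZeroDivisionError("sigma(b, x) = 0: x is mapped to infinity")
    x2 = minkowski_dot(x, x)
    return tuple((xi + bi * x2) / s for xi, bi in zip(x, b))

b = (Fr(1), Fr(0))                    # time-like parameter, b^2 = 1
x = (Fr(0), Fr(2))                    # space-like point, x^2 = -4
x_new = special_conformal(b, x)

print("sigma =", sigma(b, x))                        # -3
print("x' =", x_new)                                 # (4/3, -2/3)
print("x'^2 =", minkowski_dot(x_new, x_new))         # 4/3, time-like

x_sing = (Fr(0), Fr(1))               # sigma(b, x_sing) = 1 - 1 = 0
print("singular point:", sigma(b, x_sing) == 0)
```

Since the origin is left fixed by (3), the pair (origin, $x$) is space-like before the transformation and time-like after it, which is the violation of the causal ordering of $M_D$ discussed in the text.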
For free fields the projections on the vacuum can equivalently be performed by decomposing the field into its positive and negative frequency parts and writing an equation of the type (13) for each part separately. As multipliers we use the appropriate boundary values. As we shall show in the subsequent section, in the general case the field operator has to be harmonically analysed on the centre of $G_D$ instead, and each Fourier component transforms as (13) with its specific multiplier, composed in general of both boundary values.

Though it is not necessary, it is quite useful, both for technical and illustrative purposes, to formulate a conformally covariant quantum field theory by maintaining the form of the transformation law (13) without Fourier decomposition of the field and with a unique multiplier, by introducing fields on the universal covering space $M_D^{uc}$. This amounts essentially to letting $\varphi$ in (7) assume all real values from $-\infty$ to $+\infty$. Instead of (9) we get the structure

$$R_1 \times S_{D-1} \qquad (14)$$

In the case $D = 2$ we have to take the universal covering with respect to both factors $S_1$ and thus obtain

$$R_1 \times R_1 \qquad (15)$$

In the latter case $U$ of (7) is a diagonal $2\times 2$ matrix

$$U = \begin{pmatrix} e^{i\varphi_+} & 0 \\ 0 & e^{i\varphi_-} \end{pmatrix} \qquad (16)$$

with

$$\varphi_\pm = \varphi^0 \pm \varphi^3 \qquad (17)$$

In this case we let both $\varphi_\pm$ assume all real values.

The manifold $M_D^{uc}$ has a remarkable property first discovered by Segal (Segal, 1971; Mayer, 1974). It possesses a conformally invariant (under $G_D$) causal ordering in the sense described above. For $D > 2$, $M_D^{uc}$ possesses an infinite number of sheets labelled $n = 0, \pm 1, \pm 2, \ldots$. A space-like vector of the zeroth sheet can be mapped, by continuous variation of the group element, onto other space-like vectors on the same sheet, onto a point at infinity, and further onto points of the $\pm$first sheet that lie over time-like vectors of the zeroth sheet.
If we identify these points on the $\pm$first sheet with the points of the zeroth sheet, we have transformed space-like into time-like vectors and thus violated the causal ordering of $M_D$ (the second vector of the pair can be taken to be the null-vector). However, on $M_D^{uc}$ we may call these points on the $\pm$first sheet, obtained from space-like points on the zeroth sheet by means of special conformal transformations, also space-like. It is then easy to see that the remainder of the manifold $M_D^{uc}$ can be cast into a future and a past submanifold plus a light-cone. Thus we have succeeded in extending the causal structure from $M_D$ onto $M_D^{uc}$.

This way it is possible to define a quantum field $\Phi(\tilde x)$ on $M_D^{uc}$, with $\tilde x$ over $x$ on the $n$th sheet, such that locality can be formulated by

$$[\Phi(\tilde x), \Phi(\tilde y)] = 0 \qquad (18)$$

whenever $\tilde x$, $\tilde y$ are relatively space-like, say for a scalar field. This locality condition (18) can be postulated to be invariant under $G_D$. Any Wightman $m$-point function

$$\langle 0|\,\Phi_1(\tilde x_1)\,\Phi_2(\tilde x_2)\cdots\Phi_m(\tilde x_m)\,|0\rangle \qquad (19)$$

is independent of the sheet number $n$ if all fields are defined on the same sheet. Field theories on fixed sheets are isomorphic. In fact there is a unitary operator $Z$ so that

$$\Phi(\tilde x) = Z^{\,n_1-n_2}\,\Phi(\tilde x')\,Z^{-(n_1-n_2)} \qquad (20)$$

if $\tilde x$ ($\tilde x'$) lies on the $n_1$st ($n_2$nd) sheet over $x$. Any conformal transformation that does not lead any of the arguments of the $m$-point function (19) out of its sheet leaves (20) unaltered. From this one can deduce that $Z$ commutes with all conformal transformations and thus represents an element of the centre of $G_D$. In fact, it represents the generating element of the centre. Of course, any local observable should be identical on all sheets and thus commute with $Z$. Finally, all Wightman $m$-point functions with arguments on arbitrary sheets can be obtained from the same function with all arguments on the zeroth sheet by analytic continuation.
This and the previous assertions can be proved (Lüscher and Mack, 1975) by first requiring conformal covariance only under infinitesimal transformations, then continuing the Wightman (or time-ordered) functions into the Euclidean domain, where the generators of the conformal group can easily be implemented as Euclidean conformal transformations, and finally continuing back to the Minkowskian boundary, which then turns out to have the sheet structure just described.

In the case $D = 2$ a few alterations are necessary. There is a doubly infinite sequence of sheets $n_+ = 0, \pm 1, \pm 2, \ldots$ and $n_- = 0, \pm 1, \pm 2, \ldots$, with corresponding operators $Z_+$ and $Z_-$ as the intersheet isomorphisms. In this case $G_D$ is the direct product of two groups

$$G_2 = SU(1,1)^{uc} \times SU(1,1)^{uc} \qquad (21)$$

The first (second) group acts on $\varphi_+$ ($\varphi_-$) according to

$$e^{i\varphi_g} = \frac{\alpha\, e^{i\varphi} + \beta}{\bar\beta\, e^{i\varphi} + \bar\alpha} \qquad (22)$$

$$\varphi_g = \varphi + 2\arg\!\left(\alpha + \beta\, e^{-i\varphi}\right) \qquad (23)$$

where $\arg\alpha$ is allowed to range over $-\infty < \arg\alpha < +\infty$ in order to obtain the universal covering group of $SU(1,1)$.

3. LOCAL FIELDS AND THEIR TRANSFORMATIONS

Physically relevant unitary irreducible representations of $G_4$, the universal covering group of the conformal group, can be constructed in the usual fashion by inducing them from an appropriate subgroup. Requiring that these representations be realizable on spaces of functions of vectors $x$ in Minkowski space, i.e. by classical fields, we are led to consider the stability subgroup of these vectors. For $x = 0$ this subgroup consists of homogeneous Lorentz transformations, of dilations, and of special conformal transformations. The representations obtained in this fashion have been classified as follows (Mack and Salam, 1969). Let $\kappa_\mu$ be the generators of the special conformal transformations (3), represented by a matrix acting on the classical field at $x = 0$.
Then $\kappa_\mu$ may be identically zero; this type of field representation is called Ia in Mack and Salam (1969). Secondly, $\kappa_\mu$ may be non-zero but a finite dimensional matrix. Then it has to be nilpotent due to the abelian structure of the subgroup of special conformal transformations. Such representations are denoted Ib. Finally there are infinite dimensional matrices $\kappa_\mu$; these are denoted type II. Representations of type Ia are the representations almost exclusively encountered in field theory. If we require in addition that the energy-momentum spectrum be restricted to the forward light-cone, we obtain the discrete series representations mentioned in Section 2 that were used there for the one-particle states (Rühl, 1973; Mack, 1975).

We mention finally that the group $G_2$ and its representations that are used for the Thirring model in $M_2$ have been studied by many authors. We shall only explain a few notations in this article but otherwise refer to an exhaustive presentation in the literature (Rühl and Yunn, 1975a).

The transformation property of a conformally covariant spinor field under special conformal transformations of $G_4$ is by representation theory

$$U_g\,\Phi_{AB}(x)\,U_g^{-1} = \sigma(b,x)^{-d-j_1-j_2}\sum_{A'B'} D^{j_1}(\sigma_0 + XB)_{AA'}\,\Phi_{A'B'}(x_g)\,D^{j_2}(\sigma_0 + BX)_{B'B} \qquad (25)$$

$$B = b^0\sigma_0 - \mathbf{b}\cdot\boldsymbol{\sigma} \qquad (26)$$

for group elements sufficiently close to the unit element. $D^{j}$ denote representation matrices of covariant spinor representations of SL(2, C). As pointed out in the preceding section, the difficulty consists in interpreting the singular factor $\sigma^{-d-j_1-j_2}$, which for fixed $x$ and a sufficiently small neighbourhood of the group unit of $G_4$ is regular and well defined by $\arg\sigma(b, x) = 0$.

The solution to the general problem has been found by studying the Schroer model (Schroer and Swieca, 1974) and the Thirring model (Kupsch, Rühl and Yunn, 1975) in $M_2$. We shall describe it now in general terms. We assume that we have a Wightman-type field theory that is conformally covariant, i.e.
there exists a unitary representation U g of G D with U g \0) = \0) (27) The generating element of the centre of G D (D >2) be represented by Z. We introduce the Fourier component $ T , =£ t < 1 on the centre of G D <D T (x)= I Z n <&0c)Z-" e" (28) Obviously Z<D T (x)Z- 1 = e 2mT $ T (x) Riihl and Yunn 333 (29) We continue the equation (25) away from the group unit so that x on the zeroth sheet over x e M D moves over to the first sheet (minus first sheet). We consider the two boundary values a-JJb, x) = lim a{b, x ± iy) y-0 (30) where y tends to zero in the forward light-cone. We obtain then on the first (minus first) sheet arg<r±(b,x) = Tir(±ir) (31) In accordance with (29) we make therefore the ansatz U^bU; 1 = <r + (b, xp"- h+T a.(b, *)-**-'*-* X I D h (<To+XB) AA ® T A - B {x g )DHcro + BX) B . B (32) A' B ' We can introduce field operators on Mp by setting on the zeroth sheet <P AB (x) = 2 2(d " 1) |det (cro- HOI- 1 -''-* x I D\Acr -iX)f A . B {U{X))D% B (ao-iX) (33) A' B ' and requiring a 'local' transformation law U g f AB {U)U- s l = |det (A f + I/BT^Idet (CU+D)! - *"^ x £ D>i{A*+UB*) AA .f A +{U a )D'H£U+D) B . B (34) Here A, B, Q D denote 2x2 matrices making up a 4 x 4 matrix (AB\ m ■ \CD! H=(~°° ° Y m f H = Hm- 1 \ +cr / (35) (36) that belongs to SU(2, 2). The group 51/(2, 2) uc is isomorphic to G 4 . The matrix m with B = C = 0,A=D = i<r o generates the centre of SU(2, 2) uc . We find from (34) Zf^iu, <p)Z~ x = e^^W-", <P ~2ir) (37) (38) 334 Uncertainty Principle and Foundations of Quantum Mechanics by means of U^iAU+BKCU+D)- 1 (39) <p g = ( p-argdet(A t +C/£ t ) -argdet(CT/+£>) (40) Of course the field operator f A s(U) can also be decomposed on the centre of G 4 by a formula such as (28). It follows then from (38) = e" r(/W ' ) / T (-w,<p-27r) Therefore r(U) can be expanded in the canonical basis T(«,<P)= I Z Z <e iw %( M ) 7=0, |, 1... 
r, i = -/4 = -» with 5(;'2-/i)+/'-9 = Tmodl (41) (42) (43) This expansion is in many cases more advantageous than a Fourier decomposi- tion into plane waves on M 4 . The central element Z is an element of a one-parameter subgroup of G D and as such can be written in the form Z = e 2 (44) The self-adjoint operator T is not uniquely determined but only up to a self-adjoint operator with entire eigenvalues, such as a number operator in a free field model. T has always the character of an 'anomalous part' of such a number operator in the known models. The eigenvalues of T and its corres- ponding eigenspaces fix the irreducible components of a local operator. In the known models its spectrum is discrete and it has been suggested (Luscher and Mack, 1975) that this be so in general. It has also been proved (Luscher and Mack, 1975) that the spectrum of T can be assumed to be positive. We shall make use now of the hypothesis that the spectrum is discrete. If A, denote the eigenvalues of T and n(A,) their respective projection operators on the eigenspaces, we define It follows then that ^ KlX =U(\ 1 )^Il(X 2 ) z^z-W^'-^^ and inserting this into (28) T = A!-A 2 mod 1 (45) (46) (47) Riihl and Yunn 335 and finally $ T = Z *a,a 2 (48) AlA 2 A1-A2-T As an example we want to present now the Thirring model (Kupsch, Riihl and Yunn, 1975 ; Riihl, 1975). We shall use the formalism by means of fields on M" c - The main tool in the construction of the Thirring field is the current operator (Dell'Antonio, Frishman and Zwanziger, 1972) that transforms as a free mass zero vector field. Its components /+ and /_ depend on <p+ and q>- respectively only and, expanded in the canonical basis, are 1 OO 1 /*(*>*) = - Z (w + l) 5 {cl, m e' (m+I ^ + c ± , m e IT m=0 f or (50) (51) j ± (<p ± ) =/ ± - ) (<pJ+/ ± +) (^ ± )+-<? ± IT The operators c±, m , cl, m satisfy canonical commutation relations, e.g. and commute with the charge operators Q ± . 
We introduce the 'sources' of the current operators by itW = 7rf" ± d^/l-ty) = YtX<pJ (52) J+ioo Then we define the Thirring field (y = 1, 2) by Riihl and Yunn (1975b) f y (<p) = 2 d exp 1 Z C±, y [lt\cp ± ) +§0 ± <p ± ] ± x * y exp i Z C ± , Y [ri + V*) + ±Q±V±\ (53) ± where the coefficients C±, Y are defined by Q ± <r y = a y [Q ± + C ± , y ] (54) <r y commutes with the source terms (52). Under a space reflection we require that the two components of a y inter- change and that Q± goes into Q^. This necessitates C ±)1 = CV, 2 :=C ± (55) which leaves two free parameters to the model. The spin and dimension of the field operator (53) are d = %Cl + Cl] (56) s=\\Cl-Cl\ (57) (58) 336 Uncertainty Principle and Foundations of Quantum Mechanics By differentiation of (53) we find four equations -ijfJM = ^4(^W+^o ± )/>) + />)(/ ± + W+^Q ± )] two of which can be shown to be identical with the field equations (Thirring, 1958) - »a M y^(x) = gy"tf:\xMx) + ^(x)J ( ;\x)] (59) whereas two equations are additional. The coupling constant g is g = 27rC- (60) The operator ay is a constant field belonging to d = s = as can be seen from (56), (57). It satisfies abnormal commutation relations (Klaiber, 1968; Low- enstein and Swieca, 1971) {oi,o 2 } = {oi,o 2 } = [01, oI] = [o 2 ,o 2 ] = — Z7T (61) and has the Wightman functions (OKa.no-lflO) = <0|(o 2 )" (o 2 )"]0> = (27r)- n S nn . (62) The transformation behaviour of the Thirring field (53) under the conformal group follows from the canonical transformation behaviour of the current, from the invariance of the vacuum state and the charge operators Q ± , and from an appropriate definition of the transformation behaviour of the <r-field. This takes account of the non-invariance of the subtraction point ±100 in (52), and is consistent with the invariance of the Wightman functions (62) and the com- mutators (61) (Riihl and Yunn, 1975b). 
We obtain UJ y {<p)U- 1 = n{|« ± e^+/3 ± r c H/> + *, <P- g ) (63) Moreover we have from (53), (54) /> ± + 2ir,*= F ) = e fa< H(?)e „->>Qi (64) In fact the finite conformal transformations (63) can be obtained from the energy momentum tensor (Dell' Antonio and coworkers, 1972) in 'Sugawara form' ©^ = v : J J, : -7.tr- JJ K : g„ v (65) by the canonical (that is: free massless field) formulae for the generators by Riihl and Yunn 337 exponentiation. Among these generators we find the combinations T ± = I (m + l)cl, m c ± , m +^ m=0 (66) that create one-parameter subgroups containing the central elements Z+ and Z_ of G 2 in agreement with (64). From (54) we see that the eigenspaces of Q ± belong to the eigenvalues A±(«i, n 2 ) = n 1 C ± + n 2 C T (67) («i,2 = 0, ±1, ±2, . . . ). They are obtained by applying a^ n r times (respectively 01" (-«i) times) and a 2 n 2 times (respectively cr 2 (-n 2 ) times) to the vacuum state and operating with arbitrary polynomials of the currents on these states. It follows that the operator f y {<p)Y\.(\. ± (n u n 2 )) has one covariant component only with T ±>y ^ C^nxCt + iijO + iCi ±.y mod 1 (68) A crucial property of the Thirring model is that any product of operators f y or their adjoints f y can be regularized by splitting off a regular factor R nfyitii) YlfM) = SM*yitl<Pih W]R { yi};{y;U<Pih {*>}] (69) that is C°°, multifocal, and conformally covariant in all variables. The singular factor 5 is a covariant distribution. Identifying arguments in R leads to other local conformally covariant operators. Applying derivative operators to R before identifying arguments leads to local, but to conformally covariant differential operators only if the differential operator is itself covariant in the sense specified in the subsequent section. For the Thirring model the conformal analysis of operator products reduces therefore to the analysis of the reg- ularized products. 4. 
COVARIANT OPERATOR PRODUCT DECOMPOSITIONS

Products of local operators $A(x)B(y)$ and their singular behaviour as $x$ approaches $y$ have been studied for two different purposes.

The first approach was motivated by phenomenology, namely the investigation of the high energy asymptotic behaviour of certain matrix elements of such operator products (e.g. deep inelastic electron proton scattering). This approach aimed at asymptotic expansions of the type

$$A(x)B(y) \simeq \sum_{n=0}^{\infty} s_n(u)\,C_n(v), \qquad u = x - y, \quad v = \tfrac{1}{2}(x+y) \qquad (70)$$

either for $u \to 0$ ('short distance expansion' or Wilson expansion) or for $u^2 \to 0$ ('light-cone expansion'); see Wilson (1969) and Wilson and Zimmermann (1972) for the former, and Brandt and Preparata (1971) and Frishman (1971) for the latter. Both kinds of expansions have been studied in the framework of perturbative quantum field theory (Zimmermann, 1970, 1973). The singularity of the function $s_n(u)$ decreases with increasing $n$, whereas the $C_n(v)$ are local operators.

The second approach is fundamental (Polyakov, 1973; Efremov, 1968; Ferrara and co-workers, 1973; Bonora and co-workers, 1973; Swieca, 1974; Schroer and co-workers, 1975). If expansions of the type (70) for all local operators in a Wightman formalism are given together with all the two-point functions for these fields, then the structure of the quantum field theory is fixed, provided the expansion (70) holds not only asymptotically but in the sense of weak convergence in some real or complex domain. In fact, all $n$-point functions can be reduced to two-point functions this way. Within a conformally covariant quantum field theory this programme seems to have a chance of being set up successfully. In fact, once a normalization of the local fields has been fixed by any ad hoc prescription, all two-point functions are uniquely determined.
Moreover, requiring each term in the expansion (70) to be conformally covariant reduces the number of local fields and restricts the form of the singular functions. We investigate such a programme in this section, considering the Thirring model as a guide. We restrict the investigation to $M_2$.

In order to derive an expansion of the Wilson type whose terms are each covariant (or semicovariant, as we shall see), we intend to apply the tensor product decomposition theorem for the conformal group. First we project out covariant components $A^{\tau_A}(x)$, $B^{\tau_B}(y)$ of $A(x)$ and $B(y)$ as explained in the preceding section. For these components we make an ansatz

$$A^{\tau_A}(x)\,B^{\tau_B}(y) \simeq \sum_n \int dz\; Q(\chi_A, x;\, \chi_B, y \,|\, \chi(n), z)\; C_n^{\tau_C}(z) \qquad (71)$$

where the kernel $Q$ satisfies the covariance constraints. $C_n^{\tau_C}(z)$ is a covariant component of a local operator with

$$\tau_C = \tau_A + \tau_B \mod 1 \qquad (72)$$

$\chi_{A,B}$ and $\chi(n)$ denote the representations of $G_D$ involved. Finally there remains the problem of recombining the components $C_n^{\tau_C}(z)$ to a local operator. The tensor product decomposition theorem is used both to derive the kernels $Q$ and the operators $C_n^{\tau_C}(z)$, where for the latter part of the problem explicit knowledge of the quantum field model under investigation is presumed.

We outline the derivation of the expansion (71) for models in $M_2$ by group theoretic arguments (Rühl and Yunn, 1975a, b, c). The first tool we need is an asymptotic completeness relation for covariant kernels of the second kind. We consider a space of $C^\infty$ functions $f(\varphi)$, $-\infty < \varphi < +\infty$, with

$$f(\varphi + 2\pi) = e^{2\pi i\tau} f(\varphi), \qquad 0 \le \tau < 1 \qquad (73)$$

with an appropriate set of norms. We denote it $\mathcal{D}_\tau$. It carries a representation $\chi = (j, \tau)$ of $SU(1,1)^{uc}$ if we define

$$T_g^{\chi} f(\varphi) = |\alpha\, e^{i\varphi} + \beta|^{2j-1}\, f(\varphi_g) \qquad (74)$$

with $\varphi_g$ as in (22), (23). If $j$ is purely imaginary, $\mathcal{D}_\tau$ can be completed to a Hilbert space with invariant scalar product

$$(f_1, f_2) = \int_0^{2\pi} \overline{f_1(\varphi)}\, f_2(\varphi)\, d\varphi \qquad (75)$$

These representations form the principal series of $SU(1,1)^{uc}$.
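The invariance of the scalar product (75) for purely imaginary $j$ can be checked numerically. The sketch below assumes that the group action reads $e^{i\varphi_g} = (\alpha e^{i\varphi} + \beta)/(\bar\beta e^{i\varphi} + \bar\alpha)$ with $\varphi_g = \varphi + 2\arg(\alpha + \beta e^{-i\varphi})$ and $|\alpha|^2 - |\beta|^2 = 1$, and that the representation is $T_g f(\varphi) = |\alpha e^{i\varphi} + \beta|^{2j-1} f(\varphi_g)$. The group element, the test function and all Python names are arbitrary illustrative choices.

```python
import cmath
import math


def phi_g(phi, alpha, beta):
    """Transformed angle, assuming phi_g = phi + 2 arg(alpha + beta e^{-i phi})."""
    return phi + 2.0 * cmath.phase(alpha + beta * cmath.exp(-1j * phi))


def transform(f, j, alpha, beta):
    """Return T_g f, assuming (T_g f)(phi) = |alpha e^{i phi} + beta|^(2j-1) f(phi_g)."""
    def tf(phi):
        weight = abs(alpha * cmath.exp(1j * phi) + beta) ** (2 * j - 1)
        return weight * f(phi_g(phi, alpha, beta))
    return tf


def norm_sq(f, n=4096):
    """Riemann sum of |f|^2 over one period [0, 2 pi)."""
    h = 2.0 * math.pi / n
    return h * sum(abs(f(k * h)) ** 2 for k in range(n))


# Illustrative group element with |alpha|^2 - |beta|^2 = 1.
t = 0.7
alpha = math.cosh(t) * cmath.exp(0.3j)
beta = math.sinh(t) * cmath.exp(-1.1j)

# Test function with f(phi + 2 pi) = e^{2 pi i tau} f(phi).
tau = 0.25


def f(phi):
    return cmath.exp(1j * tau * phi) * (1.0 + 0.5 * math.cos(phi))


# Consistency of the two forms of the group action on the circle.
mobius_residual = max(
    abs(
        cmath.exp(1j * phi_g(p, alpha, beta))
        - (alpha * cmath.exp(1j * p) + beta)
        / (beta.conjugate() * cmath.exp(1j * p) + alpha.conjugate())
    )
    for p in [0.1 * k for k in range(63)]
)

# Invariance of the norm for purely imaginary j (principal series).
j = 0.8j
norm_before = norm_sq(f)
norm_after = norm_sq(transform(f, j, alpha, beta))

if __name__ == "__main__":
    print("residual of the group action:", mobius_residual)
    print("norm before and after:", norm_before, norm_after)
```

The invariance rests on the Jacobian $d\varphi_g/d\varphi = |\alpha e^{i\varphi} + \beta|^{-2}$, which exactly compensates the modulus $|\alpha e^{i\varphi} + \beta|^{-1}$ of the multiplier when $j$ is purely imaginary; for real $j \neq 0$ the two norms in the sketch no longer agree.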
If l-/=Fr^0modl (76) <3> T possesses invariant subspaces 9^ respectively, spanned by the canonical basis elements e^, q - t & mod 1, with ±<Z = 2-/' + m, m =0,1,2,... (77) They carry the discrete series representations. Tensor products of spaces 2 T can be mapped into spaces 2> T by means of operators K which we call covariant if K(T x g ixT x g *) = T x g *K (78) Such covariant operators can be given in the form of convolution kernels. They span themselves a two-dimensional linear space. As a basis we take the following two kernels (arg (2/ sin (<p - iO)) = -arg (-2/ sin {<p + iO)) - v/2 for 0<<p<ir) (ZnyKiixs, (p 3 \xu <pu x 2 , <Pi) = [2i sin §((?! - <p 2 - iO)Y l ~ h+T2 x [-2/ sin §((?! - <p 2 + iO)Y h ~ h ~ T2 x [2/ sin U<P2 - <P 3 - iO)P +h ~ h+h x [2/ sin |(<p 3 ~<Pi - iO)T i+h+Tl+T2 x [-2/ sin |(^3 ~ <Pi + iO)T h+h ~ Tl ~ T2 (79) (2ir) 3 K 2 (x 3 , <p 3 \xu <PuX2, <Pi) = [2/ sin U<Pi~ <Pi ~ iO)T* " /l_Tl x [-2/ sin !((?! - <p 2 + /0)] _ ''2- y 3 +T i x [2/ sin \{<p 2 -<p 3 - iO)} - * + ' 3 ~ Tl-T2 x [-2/ sin \{<p 2 -<p 3 + ioy\ + ' l ~ h+Tl+T2 x [2i sin \{<p 3 -<p r - /0)]- J -" + ' 2+ '3 (80) 340 Uncertainty Principle and Foundations of Quantum Mechanics [In Riihl and Yunn (1975a, b, c) the kernel K 2 was denoted K 3 \] Note that t 3 is necessarily Ti + r 2 mod 1 . If all representations are in the principal series, the operators are well defined. For other representations we obtain the operators K by analytic continuation in /. This method is used throughout, the spaces 2) T that are independent of / are particularly useful for this purpose. A kernel K may develop poles and zeros during this continuation. Poles may either be related with the appearance of invariant subspaces &? (r-type poles) or at positions depending only on A, / 2 , / 3 (/'-type poles). Residues of /-type poles may turn out to be differential operators that are also denoted covariant. They can be used to construct new local covariant operators from multilocal covariant operators. 
Dual covariant operators for the mapping K d T x * = {T£xT£)K d (81) can be obtained from the covariant kernels by replacing /,->-/,, T;-»-r,- (/' = 1, 2, 3). Both types of kernels together satisfy a completeness relation (2tt) 2 7T •I djuG^psJ d(p 3 X{K d (Xl, (PU X2, <P2|*3, <P3)Ki(X3, (PilXl, <PuX2, <Pl) -Kfixu <Pu X2, <Pi\Xi, <P3)K 2 (X3, <P3\Xi, <Pi; X2, <p'i)} + discrete series terms +00 "*"°° = £ e 2m ' T ^5(<p 1 -( P i-27rfc 1 ) I e 2 ™^8{<p 2 - <p' 2 -2nk 2 ) (82) where dpGt- 3 ) PS is the principal series part of the Plancherel measure (j 3 = v) p sh 2irp dfi(x3)ps = ch277-p + COS27TT 3 dp (83) Applying K x and K 2 to a regularized bilocal covariant operator, we obtain covariant and in general non-local operators (due to the integration, there is in addition always the non-locality from projecting on eigenspaces of T+ and TL). Inserting them into (82) allows us to reconstruct the original operators. Shifting the contour of (82) to any direction, yields contributions from the poles (four sequences of /-type and two of r-type poles for either variables <p ± ) plus a residual integral. Both with increasing and decreasing Re/ 3 the degree of singularity of these contributions grows. Therefore we do not obtain a covariant Wilson expansion. To overcome this difficulty we proceed as in the theory of Regge poles (Riihl, 1969). We exploit the symmetry of the integrand in (82) under ;' 3 ->-/ 3 and replace K 12 by kernels of the second kind Ki=Q la + Q lb K d 2 = Q 2a + Q 2b so that Riihl and Yunn 341 (84) ^(QlaK^Q^KJ goes into t<p 3 {Q2bKi-Q lb K 2 ) Jo under / 3 -> —j 3 . 
In turn we have explicitly Q la = aKi + b exp {-iir(j\ -j 2 + Ti + t 2 ) sign sin (<pi - <p 2 )}AT 2 with Q 2a = exp {mt(/i -j 2 + T X + t 2 ) sign sin (<p x - <jp 2 )}Qi _ sin 7r(|-/i+/ 2 +/ 3 ) sin ir(&+ h~ t^ t 2 ) sin 2irj 3 sin 7r(j\ -j 2 + r t + r 2 ) (85) sinff(s+/i-y 2 +/ 3 ) sin 7r(| +/ 3 + r t + t 2 ) (86) sin 2irj 3 sin w(Ji ~h +U + t 2 ) From (85) we see that the kernels Q a are not globally covariant but only infinitesimally, namely whenever sin( ( p 1 -<jp 2 )#0. We call them therefore 'semicovariant'. For <pi-><p 2 we have asymptotically (as a distribution in <p 3 ) \Q la \ - const |2 sin J(<9i -^a)!"*"*" h+il+h (87) whence a decreasing singularity in the left half plane. Shifting then the contour in (82) we obtain the asymptotic completeness relation +0O 4-oo S e 2mV ^5(^ 1 -^' 1 -277fc 1 ) I e 2m ' T ^(? 2 -«p 2 -27rtfc 2 ) fcl— oo * 2 =-oo = (27r) 2 exp {iV(|-/ 2 + r 2 ) sign sin (<p x - <p 2 )} oo /_j\fc + i X 2 — TT— n2j 3 -k)i 3 ltgwU 3 + T 1 + T 2 ) + tgTT(j3-T 1 -r 2 )] k=0 K\ J'2ir <i 2 ir d<p 3 \ d<p' 3 Q la (Xu<Pi;X2,<P2\X3,<P3)S(X3,<P3\X3,<P3) o Jo x A(* 3 , <p' 3 )\xu <PuX2, *>2)U-/.(fc) (88) where Xl = (-/ 3 , t 3 ) (89) 342 Uncertainty Principle and Foundations oi Quantum Mechanics and h(k) = k-h-J2 + k (90) 5 is an intertwining operator, i.e. a continuous operator from ®„ into 3 n that intertwines xl and * 3 . A is a covariant differential operator d<px d<p 2 ACt3,<P3lA'l.<Pi;Ar2,<P2)g(<Pl»<P2) Jo Jo = Qk( ~i — > -i—)s(.<Pi> <P2)Ui=^=^ ( 91 ) \ 3<pi 0<P2/ I I«P1 = <P2 = < P3 where O k is a hypergeometric polynomial Gkfal,92)= I (-D m ( fc )(2yi-fc) m (|-/2-42)m(2/2-fc)«c-m(|-/l-<7l)fc-m ™=n \TYl J (92) The asymptotic completeness relation (88) is applied twice, once to the variables labelled (+), once to those labelled (-) in a regularized covariant operator R(q> l+ , «Pi-, <p 2+ , <P 2 -)- Of course we have first to project on a simul- taneous eigenspace of T + and 71. 
For a light-cone expansion it suffices to apply (88) to one variable only. In the completeness relation we have then the operators

$$O\big(\chi_3^d(k_+), \chi_3^d(k_-);\, \varphi_{3+}, \varphi_{3-}\big)\,\Pi(\lambda_+, \lambda_-) = \prod_{\pm} Q_{k_\pm}\!\left(-i\frac{\partial}{\partial\varphi_{1\pm}},\, -i\frac{\partial}{\partial\varphi_{2\pm}}\right) R(\varphi_{1+}, \varphi_{1-}, \varphi_{2+}, \varphi_{2-})\,\Pi(\lambda_+, \lambda_-)\Big|_{\varphi_{1\pm} = \varphi_{2\pm} = \varphi_{3\pm}} \qquad (93)$$

Obviously the differential operators (91) do not depend on $\lambda_+$ and $\lambda_-$. One can then show that in this way we obtain a semicovariant Wilson expansion, in so far as the degree of singularity of the semicovariant kernels decreases with $k_+$ and $k_-$. But locality of the operators (93) is not yet guaranteed. Finally we multiply both sides of the expansion for $R$ with the singular factor and get a semicovariant Wilson expansion for the operator product itself.

It ought to be mentioned that in certain degenerate cases, namely if discrete series representations occur, the series may be truly covariant termwise (Rühl and Yunn, 1975b; Swieca, 1974; Schroer and Swieca, 1975). This is due to the fact that one of the components in (85) drops out, and covariance is restored this way. A famous degenerate case of this kind is obtained if the expansion operates on the vacuum state. The discrete series representations appear then as a consequence of the spectrum condition (Section 2).

5. PROBLEMS WITH LOCALITY

The Thirring model is particularly simple in several respects. Firstly, we know that the field operators (53), and all those local operators derived from regularized multilocal products of them, have a certain simple behaviour under commutation with $Q_\pm$ (see (54)) and therefore with $\tfrac{1}{2}Q_\pm^2$. Thus projecting a product $A(x)B(y)$ on simultaneous eigenspaces of $T_+$ and $T_-$,

$$A(x)\,B(y)\,\Pi(\lambda_+, \lambda_-) \qquad (94)$$

fixes both pairs $\tau_{A\pm}$ and $\tau_{B\pm}$. Secondly, we can regularize any product of such operators by splitting off a singular covariant factor, leaving a covariant bilocal $C^\infty$ operator.
Any such operator in $M_2$ leads to a series (a 'family') of operators with increasing dimension

$$d = -j_+ - j_- + 1 \qquad (95)$$

which for $\chi_3^d(k_\pm)$, $j_\pm = -j_3(k_\pm)$, gives

$$d(k_+, k_-) = d_1 + d_2 + k_+ + k_- \qquad (96)$$

i.e. the dimensions increase in integral steps. In fact, our asymptotic completeness relation is a reordered Taylor expansion, and each differentiation raises the dimension by one (within the context of a Weyl symmetry). It is known from perturbation theory with respect to $\epsilon$ in a model in a space-time of $6 - \epsilon$ dimensions that this set of operators is too small for a Wilson expansion in general (Mack, 1973).

The two properties of the Thirring model are not quite independent. If an operator product can be covariantly regularized as in (69), then this can make sense only if projections on eigenspaces from the right and the left fix the transformation property of the regularized operator in both variables. In turn the transformation properties of the factors in the unregularized product must then also be fixed, namely in

$$\Pi(\lambda'_+, \lambda'_-)\,A(x)\,B(y)\,\Pi(\lambda_+, \lambda_-) = \sum_{\lambda''_\pm} \Pi(\lambda'_+, \lambda'_-)\,A(x)\,\Pi(\lambda''_+, \lambda''_-)\,B(y)\,\Pi(\lambda_+, \lambda_-) \qquad (97)$$

all $\lambda''_\pm$ must be equal modulo one. We study the problem of recombining the different projections to a local operator first and then try to gain an idea of how the general case might look.

In the Thirring model

$$\Pi(\lambda'_+, \lambda'_-)\,A(x)\,B(y)\,\Pi(\lambda_+, \lambda_-) = s(x, y)\,\Pi(\lambda'_+, \lambda'_-)\,R(x, y)\,\Pi(\lambda_+, \lambda_-) \qquad (98)$$

$\lambda'_\pm$ is either related to $\lambda_\pm$ by a function depending on the operators $A$, $B$,

$$\lambda'_\pm = h_{\pm,A,B}(\lambda_\pm) \qquad (99)$$

or both sides of (98) vanish identically. Moreover $\tau_{A\pm}$ and $\tau_{B\pm}$ are fixed and satisfy

$$\tau_{A\pm} + \tau_{B\pm} = \lambda'_\pm - \lambda_\pm \mod 1 \qquad (100)$$

Extracting the singular factor $s(x, y)$ changes the transformation property of $A$ and $B$, namely $j_{A\pm}, \tau_{A\pm}, j_{B\pm}, \tau_{B\pm}$, into that of $R$, namely $j'_{A\pm}, \tau'_{A\pm}, j'_{B\pm}, \tau'_{B\pm}$; explicitly

$$\tau'_{A\pm} = \tau_{A\pm} + t_\pm, \qquad j'_{A\pm} = j_{A\pm} + \kappa_\pm \qquad (101)$$

and

$$\tau'_{B\pm} = \tau_{B\pm} - t_\pm, \qquad j'_{B\pm} = j_{B\pm} + \kappa_\pm \qquad (102)$$

with some real parameters $t_\pm$ and $\kappa_\pm$.
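The integral spacing of the dimensions within a family can be verified with exact arithmetic. The sketch below assumes $d = -j_+ - j_- + 1$ for the dimension, $j_\pm = -j_3(k_\pm)$ for the members of the family, and $j_3(k) = k - j_1 - j_2 + \tfrac{1}{2}$ as our reading of (90); the function names and the sample values of $j_{1\pm}$, $j_{2\pm}$ are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def dimension(j_plus, j_minus):
    """Dimension of an operator in M_2, assuming d = -j_+ - j_- + 1."""
    return -j_plus - j_minus + 1


def j3(k, j1, j2):
    """Assumed reading of (90): j_3(k) = k - j_1 - j_2 + 1/2."""
    return k - j1 - j2 + HALF


def family_dimension(k_plus, k_minus, op1, op2):
    """Dimension of the family member labelled (k_+, k_-).

    op1 and op2 are pairs (j_+, j_-) of the two factors of the
    regularized bilocal operator; the member has j_pm = -j_3(k_pm).
    """
    j_plus = -j3(k_plus, op1[0], op2[0])
    j_minus = -j3(k_minus, op1[1], op2[1])
    return dimension(j_plus, j_minus)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    op1 = (Fraction(-1, 4), Fraction(-1, 4))
    op2 = (Fraction(-1, 3), Fraction(1, 6))
    d1, d2 = dimension(*op1), dimension(*op2)
    for k_plus in range(3):
        for k_minus in range(3):
            d = family_dimension(k_plus, k_minus, op1, op2)
            print(k_plus, k_minus, d, d == d1 + d2 + k_plus + k_minus)
```

Under these assumptions the two half-units in $j_3(k_+)$ and $j_3(k_-)$ combine with the additive constant in (95) to reproduce $d_1 + d_2$, so every member of the family lies an integer $k_+ + k_-$ above the sum of the dimensions of the two factors.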
$d_1$ and $d_2$ in (96) refer to the transformation properties of the regularized operator and ought to be primed, too. We write the decomposition of the regularized operator in the form ($\chi_A = \chi_{A+} \times \chi_{A-}$, etc.)

$$\Pi(\lambda'_+, \lambda'_-)\,R(x, y)\,\Pi(\lambda_+, \lambda_-) = \sum_{k_\pm} \int dz\, dz'\; Q(\chi'_A, x;\, \chi'_B, y \,|\, \chi(k_\pm), z)\; S(\chi(k_\pm), z \,|\, \chi^c(k_\pm), z')\; \Pi(\lambda'_+, \lambda'_-)\,O(\chi^c(k_\pm), z')\,\Pi(\lambda_+, \lambda_-) \qquad (103)$$

As emphasized in the preceding section, the operators $O(\chi^c(k_\pm), z)$ have a meaning without the projection operators applied to them, since the differential operator by which they are obtained from $R(x, y)$ is independent of the eigenvalues $\lambda'_\pm$ and $\lambda_\pm$. It can be shown by explicit calculation that the semicovariant kernel $Q$ in (103) (but not $Q_{1a}$!) depends, at least in a neighbourhood of $x = y$, only on

$$\tau'_{A\pm} + \tau'_{B\pm} = \tau_{A\pm} + \tau_{B\pm} = \lambda'_\pm - \lambda_\pm \mod 1 \qquad (104)$$

Of course the same holds true for the intertwining operator $S$.

In the Thirring model $\lambda'_\pm - \lambda_\pm$ is in general a function of $\lambda_\pm$. In some cases, however, namely if the operator product commutes with $Q_\pm$, this difference vanishes. Whenever the difference is a unique value modulo one, the summation over $\lambda_\pm$ can be performed in (103), and it results in an expansion for the operator $R$ in terms of local operators $O(\chi^c(k_\pm), z)$.

In general the summation cannot be performed this way over the spectrum of $T_+$ and $T_-$. Then the non-local components of the operators $O(\chi^c(k_\pm), z)$ cannot be recombined to a local operator. An alternative formulation makes the kernel $Q$ an operator such that the components of $O(\chi^c(k_\pm), z)$ combine to a local operator. The non-locality is then carried over to the kernel.

Finally we want to consider what happens to the kernel $Q$ if we multiply it with the singular function $s(x, y)$ of (98). If we insert [...], $Q$ depends (besides the coordinates) solely on the parameters $j'_{1\pm}$, $j'_{2\pm}$, $k_\pm$, and $\tau_{A\pm} + \tau_{B\pm}$.
After multiplication with $s$ we obtain an analogous function depending in the same fashion on the parameters $j_{1\pm}$, $j_{2\pm}$, $k_\pm + 2\kappa_\pm$, and $\tau_{A\pm} + \tau_{B\pm}$, except possibly for an unessential $k$-dependent change in the normalization.

The asymptotic completeness relation (88) has been derived for $C^\infty$ functions. It can be generalized to other classes of functions exhibiting singularities. This can be achieved by appropriate regularization techniques. Splitting off a singular covariant factor as in the Thirring model is just one method for just one class of functions. For this class of functions we have learnt that the whole family of local operators gets shifted in the dimension by a fixed amount and that the semicovariant kernels depend solely on the combinations $\lambda'_\pm - \lambda_\pm$.

Crucial to any model in $M_2$ is therefore the regularization of the individual terms in the sum

$$\sum_{\lambda''_\pm} \Pi(\lambda'_+, \lambda'_-)\,A(x)\,\Pi(\lambda''_+, \lambda''_-)\,B(y)\,\Pi(\lambda_+, \lambda_-) \qquad (106)$$

If all these terms can be regularized by extraction of the same covariant singular factor, we are in the same position as in the case of the Thirring model. The dependence on $\lambda''_\pm$ drops out completely and we can sum over it in a trivial fashion.

A slightly more general situation arises if the individual terms in (106) can be expanded in a series, each term of which can be covariantly regularized by factorization

$$\sum_{n=0}^{N} s_n(x, y)\,\Pi(\lambda'_+, \lambda'_-)\,R_n^{\lambda''_\pm}(x, y)\,\Pi(\lambda_+, \lambda_-) \qquad (107)$$

One could call such an expansion a 'covariant pre-Wilson expansion'. It is assumed that the singular factors $s_n$ are independent of $\lambda''_\pm$, and that the degree of the singularity of $s_n$ decreases with increasing $n$. Each term in the pre-Wilson expansion yields a family of operators

$$O_{n\lambda''_+\lambda''_-}(\chi_n^c(k_\pm), z)$$

The intermediary projection operators can be eliminated by summation over $\lambda''_\pm$. Concerning the recombination to local operators we are still in the same position as in the case of the Thirring model.
If, however, in the pre-Wilson expansion the singular factors depend on $\lambda''_\pm$ in a non-trivial fashion, we have a new source for non-local operators appearing in the Wilson expansion. In any case, a pre-Wilson expansion (107) always leads to families of operators whose dimensions are non-integrally separated. Of course, it is also conceivable that no families of operators occur at all though this seems rather natural to us.

The Construction of Quantum Field Theories

LUDWIG STREIT
Universität Bielefeld, Bielefeld, West Germany

I. THE PROBLEM

Heisenberg's uncertainty relations are the most compact formulation of the two-fold challenge presented by modern physics.
As efforts failed to dispute their fundamental nature (Einstein's 'God does not throw dice'; Heisenberg (1969) gives a vivid eye-witness account of this struggle), natural philosophy was called upon to cope with a radically new way of thinking (Heisenberg, 1960). Mathematical physics, on the other hand, was faced with the task of enlarging the structures of classical theory in such a way that the uncertainty relations would find a place in them, or more precisely, to base a consistent theory of quantum mechanics on the uncertainty relations, with classical mechanics as a macroscopic limit.

The philosophical revolution has not come to a close in the past 50 years; we shall not deal with it here. The physicist is reassured by the fact that the other, theoretical, challenge has been dealt with quite successfully. In the past 50 years quantum mechanics has become well established as a physically relevant and mathematically consistent theory [cf., for example, Mackey's book (Mackey, 1963) for an axiomatic development of quantum theory on the basis of the uncertainty principle].

But not all is well. Einstein's theory of relativity, some 20 years before the advent of quantum mechanics even, amounted to yet another transgression beyond the domain of classical physics. At first glance these new territories appear to be quite disjoint: quantum theory takes the place of classical mechanics in the submicroscopic domain of atoms and nuclei, while special relativity does so in the realm of high velocities, comparable to that of light. However, in any attempt to formulate a theory of elementary particles it is the uncertainty principle itself which points to the necessity of an amalgam between quantum mechanics and special relativity. If we insert subnuclear masses and dimensions in

$$\Delta x\,\Delta p \gtrsim \hbar$$

we find a velocity range $\Delta p/m$ which is by no means small compared to the speed of light.
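The size of this velocity range can be checked with a few lines of arithmetic. The sketch below is only an illustration: the localization length of 1 fm, the choice of nucleon and pion masses, and the function names are assumptions made here, not values given in the text. It takes $\Delta p = \hbar/\Delta x$ and compares the naive ratio $\Delta p/m$ with the speed of light.

```python
import math

# Rough estimate of the velocity spread Delta p / m implied by Delta x * Delta p ~ hbar.
# All numerical inputs are illustrative assumptions, not values from the text.

HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm


def naive_velocity_over_c(mass_mev, delta_x_fm):
    """(Delta p / m) / c with Delta p = hbar / Delta x (non-relativistic formula)."""
    delta_p_c = HBAR_C_MEV_FM / delta_x_fm  # Delta p * c in MeV
    return delta_p_c / mass_mev


def relativistic_velocity_over_c(mass_mev, delta_x_fm):
    """v / c = p c / E for momentum Delta p, with E^2 = (p c)^2 + (m c^2)^2."""
    delta_p_c = HBAR_C_MEV_FM / delta_x_fm
    return delta_p_c / math.hypot(delta_p_c, mass_mev)


if __name__ == "__main__":
    # Assumed inputs: rest energies in MeV, localization to 1 fm.
    for name, mass in (("nucleon", 938.272), ("pion", 139.570)):
        print(name,
              round(naive_velocity_over_c(mass, 1.0), 3),
              round(relativistic_velocity_over_c(mass, 1.0), 3))
```

For a nucleon the naive ratio is already about a fifth of the speed of light, and for a pion it exceeds one, which signals that the non-relativistic formula itself has broken down.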
And yet in these past 50 years, and in spite of the hard work of what are now generations of physicists, the construction of a relativistic quantum theory of interacting particles has not come to a close. Here we are faced with a problem that has turned out to be much more tenacious than its non-relativistic counterpart. Two questions come to mind: 'Why not give up?' and, if this can be answered to satisfaction, then 'Why is it so hard?'

To answer the first one need only observe that we are dealing with two theories, quantum mechanics and special relativity, which are undoubtedly appropriate and powerful where only one and not both extensions of classical physics are called for, i.e. for submicroscopic phenomena as long as one may neglect the relativistic aspect and, respectively, for relativistic phenomena as long as quantum effects are unimportant. A fundamental theory (of elementary particles, if this concept should indeed survive) must deal with phenomena which are at the same time submicroscopic and relativistic; hence the quest for such a unified relativistic quantum theory is tantamount to the search for a fundamental theory of matter.

Now why is this so hard that it has defied the efforts of so many? To clarify this a few generalities concerning the physical 'ansatz' are in order. Evidently different attacks on the problem have been based on different sets of assumptions, and the following list has frequently been criticized as overly conservative, until the recent successes of constructive quantum field theory gave indications that these assumptions are reasonable.

As in the non-relativistic theory one assumes a Hilbert space description of the physical system with the (pure) states represented by unit vectors and the observables by a non-commutative algebra of operators. A relativistic space-time structure is introduced if one considers local observables, i.e. operators with a space-time label, which

(1) transform covariantly under a suitable unitary representation of the Poincaré group with a unique invariant state (the vacuum);
(2) commute if they are affiliated with space-like separated regions of space-time.

More specifically one can consider the algebras of (e.g. bounded) observables that are associated with given space-time regions [see Araki (1969) for a review of this 'algebraic approach']. For all its mathematical advantages this framework has not permitted the formulation of a dynamical ansatz. If on the other hand one tries to use classical relativistic dynamics as, for example, given by the equations of motion of electrodynamics as a guideline for the construction of a quantized relativistic theory, one is immediately confronted with the concept of operators labelled by space-time points, i.e. with local relativistic quantum fields. Apart from deviations often dictated by frustration, the construction of such fields has been the goal of many elementary particle theorists for some 40 years.

How would one go about this? As early as 1936 Heisenberg discussed the relevance of classical non-linear field theory for elementary particle physics (Heisenberg, 1939). But almost all of the pertinent research since then has followed a different course. A systematic understanding of non-linear wave equations is only now beginning to emerge (cf., for example, Reed, 1976), and equally recent are most efforts to base quantization on their solutions (Dashen, 1974). Instead, for lack of more powerful methods, non-interacting 'free' fields (the construction of which poses no insurmountable problems) were taken as a starting-point. Interaction terms modelled after those of the classical theories were then added to the free Hamiltonian in the hope that it might be possible to treat the resulting dynamical changes perturbatively. This program quickly ran into problems of convergence, ill-defined divergent expressions, etc.
'Subtraction physics' evolved, first as the art of dropping infinite terms to obtain a finite remainder, later in a systematic way as renormalization theory. While this allowed precise predictions at least for quantized electrodynamics, the questions of existence remained open. This was, and is, particularly serious for nuclear forces since they are so strong that the perturbative approximations must also fail.

Before embarking on a more detailed discussion of Hamiltonian quantum field theory it is worth while to pause and, with a good portion of hindsight, to isolate and exhibit the sources of these difficulties.

(A) The Infinite Volume Divergence

Addition of the space integral of an interaction energy density to the free Hamiltonian

$$H = H_0 + \int h_I(x)\,dx$$

gives rise to an operator which cannot be finite when applied to the vacuum $\Omega$, since $H_0\Omega = 0$ so that

$$\|H\Omega\|^2 = \int (\Omega, h_I(x)\,h_I(y)\,\Omega)\,dx\,dy$$

with the integrand depending only on $x - y$ because of translation invariance.

A more refined argument leads to 'Haag's theorem' [Haag (1955); for a very general proof cf., for example, Emch (1972)] which says that the canonical variables of the problem with interaction cannot be equivalent to those appropriate for the free Hamiltonian.

Note that 'all' representations of the canonical commutation relations

$$[q_i, q_k] = [p_i, p_k] = 0, \qquad [q_i, p_k] = i\hbar\,\delta_{ik}, \qquad i, k = 1, \dots, n$$

are unitarily equivalent (up to multiplicity and under reasonable technical assumptions).
This important theorem of von Neumann (see, for example, Putnam, 1967) assures us that, for example, Heisenberg's matrix representation of these operators will never produce results that are different from those calculated in, say, Schrödinger's framework, where

$$q_k = x_k, \qquad p_k = -i\hbar\,\frac{\partial}{\partial x_k} \qquad \text{in } L^2(d^n x).$$

In particular, any dynamical problem of quantum mechanics can be stated in these terms and solved by applying to an 'initial data' function from $L^2$ the unitary group generated by the Hamiltonian.

Not so in quantum field theory: von Neumann's uniqueness theorem breaks down as the number of degrees of freedom becomes infinite. There is then a vast and largely unexplored set of inequivalent representations for the canonical algebra, and Haag's theorem tells us that we have to find a non-standard one appropriate for the given Hamiltonian; not even the canonical algebras for free fields of two different masses are equivalent.

We mention in passing that the situation is somewhat different if we formulate the initial value problem not on a space-like hyperplane of space-time like, for example,

$$\{(x, t) : t = 0\}$$

but on a light-like one such as

$$\{(x, t) : x_1 + t = 0\}.$$

But at this point little is known about the adequacy of such data for non-trivial theories; cf., for example, Leutwyler and coworkers (1970) or, for some recent results and further references, Driessler (in press).

That is, we have to deal with the paradoxical situation that to state the initial values has become a non-trivial, dynamical question, and we have to solve it before we can even formulate the dynamical problem correctly. This discouraging paradox was bound to influence the directions of research.
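The infinite volume divergence of item (A), which lies behind this paradox, can be made explicit by a change of variables. The following is a heuristic sketch only; the symbols $W$, $\xi$ and $V$ are introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
\|H\Omega\|^2
  &= \int\!\!\int (\Omega,\, h_I(x)\, h_I(y)\, \Omega)\, dx\, dy
  && (H_0\Omega = 0) \\
  &= \int\!\!\int W(x - y)\, dx\, dy
  && W(x - y) := (\Omega,\, h_I(x)\, h_I(y)\, \Omega) \\
  &= \int dy \int W(\xi)\, d\xi
  && \xi = x - y \\
  &= V \int W(\xi)\, d\xi ,
  && V = \int dy = \infty .
\end{align*}
```

Translation invariance thus factors out the total volume $V$ of space, so the norm is infinite unless the remaining integral vanishes. Any finite box makes the expression finite as far as this factor is concerned, which is the motivation for the space cutoffs used in constructive approaches.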
The decade from 1955 to 1965 was characterized by the strategy to learn about field theories not by construction but by postulating their existence and fundamental properties (locality, relativistic covariance, energy-momentum spectrum) as in the 'axiomatic' formulations of Lehmann, Symanzik and Zimmermann, of Wightman and, for local rings of bounded observables, of Araki, Haag and Kastler (cf., for example, Jost, 1965; Streater and Wightman, 1964; Emch, 1972). The constructive problem was generally relegated until after the advent of some 'totally new creative idea, a further essential change in our conceptions of the structural laws of matter' as one author put it in 1956. It was a fascinating episode of recent science history to observe how, ten years later, this turned out not to be the case. But let us first turn to the other obstacles that field quantization had to cope with.

(B) The Ultraviolet Divergences

There were expectations in the early days of quantum field theory that singularities such as the infinite self-energy of classical point charges would go away through quantization. But quite to the contrary, virtually every second calculation of quantum electrodynamics included the process of throwing away an infinite term and interpreting the remainder as the 'correct result'. These procedures were formalized in the renormalization theory of Feynman, Dyson and Schwinger. Used as a recipe for calculations they allowed for the astounding numerical predictions of quantum electrodynamics, while at the same time the meaning of the formal dynamical ansatz, or the formulation of a meaningful one, was further obscured.

(C) Series Divergences

The successes of renormalized perturbation theory as applied to quantum electrodynamics are even more impressive in the face of yet another type of divergence: series divergences. For non-linear interactions the convergence question of the perturbation series for, say, the Green's functions looks hopeless.
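The convergence question just raised can be illustrated in a zero-dimensional toy model. This is an ordinary integral rather than a field theory, chosen here purely for illustration and not taken from the text: $Z(g) = (2\pi)^{-1/2}\int e^{-x^2/2 - g x^4}\,dx$. Expanding $e^{-g x^4}$ and using the Gaussian moments $\langle x^{4n}\rangle = (4n-1)!!$ gives the exact Taylor coefficients, and the sketch below shows that they grow factorially. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def odd_double_factorial(k):
    """(2k-1)!! = 1*3*5*...*(2k-1): the number of complete pairings of 2k objects."""
    result = 1
    for j in range(1, 2 * k, 2):
        result *= j
    return result


def toy_coefficient(n):
    """Exact n-th Taylor coefficient in g of the toy integral Z(g)."""
    # <x^(4n)> for a unit Gaussian is (4n-1)!!; exp(-g x^4) contributes (-g)^n / n!.
    return Fraction((-1) ** n * odd_double_factorial(2 * n), factorial(n))


def coefficient_ratio(n):
    """|a_(n+1) / a_n|, which equals (4n+1)(4n+3)/(n+1) and grows like 16 n."""
    return abs(toy_coefficient(n + 1) / toy_coefficient(n))


if __name__ == "__main__":
    g = Fraction(1, 10)  # assumed coupling, for illustration only
    partial_sum = Fraction(0)
    for n in range(8):
        partial_sum += toy_coefficient(n) * g ** n
        print(n, float(toy_coefficient(n)), float(partial_sum))
```

Since the ratio of successive coefficients grows without bound, the series has zero radius of convergence for every non-zero coupling, even though the integral itself is perfectly finite for $g > 0$. For $g < 0$ the integral does not exist at all.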
Combinatorial considerations show a veritable explosion of the number of terms as the order of the perturbation increases. Also a glance at, for example, the quartic oscillator potential makes plausible that inverting the sign of the coupling constant changes the nature of the interaction so drastically that we should not expect analyticity in the neighbourhood of zero.

(D) Infrared Divergences

With these we come to the end of our list of difficulties. They arise from the long range of forces mediated by the exchange of massless particles. In momentum space these long distance problems become problems of small momenta (hence the name). Certain aspects can be studied in non-relativistic models: note the discussion of scattering from potentials with Coulomb tails. Also, in contradistinction to the other complications, this one is not intrinsic to all non-trivial local quantum field theories. It does not arise as long as we focus on theories without massless excitations, and we shall not consider it further.

As we turn to an account of recent progress in constructive quantum field theory we shall aim neither for mathematical rigour nor for any kind of completeness (this would be quite meaningless anyway in a situation of such rapid progress) but instead we shall try to communicate to the non-expert how the main structural problems are being tackled and what one can say about the evolving theory. (The references were selected accordingly.)

II. HAMILTONIAN QUANTUM FIELD THEORY

A systematic exposition of the subject may be found in various texts (e.g., Schweber, 1961; Bjorken and Drell, 1965); we shall content ourselves here with presenting the most important concepts as generalizations of ones that are well known from non-relativistic quantum mechanics to the case of infinitely many degrees of freedom.
We shall begin our short dictionary of quantum field theory language with the canonical variables, which in field theory are indexed by points in $s$-dimensional space. In this and the following examples the expressions labelled QM refer to quantum mechanics and those labelled QFT to quantum field theory.

QM: $$[q_i, p_k] = i\,\delta_{ik}$$
QFT: $$[\varphi(x), \pi(y)] = i\,\delta^{(s)}(x - y)$$

Generic variables are obtained as follows:

QM: $$(q, \lambda) = \sum_{k=1}^{n} q_k \lambda_k, \qquad \lambda \in \mathbb{R}^n$$
QFT: $$\varphi(f) = \int f(x)\,\varphi(x)\,d^s x, \qquad f \in \mathcal{S}$$

This 'smearing out' with smooth, rapidly decreasing functions has the extra advantage of making $\varphi(f)$ a less singular operator than $\varphi(x)$ is. The equations of motion are

QM: $$\dot q_k = i[H, q_k] = p_k, \qquad \ddot q_k = i[H, p_k]$$
QFT: $$\dot\varphi(x) = i[H, \varphi(x)] = \pi(x), \qquad \ddot\varphi(x) = i[H, \pi(x)]$$

For the vacuum state we borrow a typical property of quantum mechanical ground states:

QM: $\psi_0(x) \neq 0$ for almost all $x$.
QFT: $\Psi_0$ is cyclic for the field algebra $\mathcal{A}_\varphi$, i.e. $\mathcal{A}_\varphi \Psi_0$ is dense in the representation space.

Such cyclic representations allow for a very compact and useful description via

QM: $$E(\lambda) = (\psi_0, e^{i(q,\lambda)}\,\psi_0) = \int_{\mathbb{R}^n} e^{i(x,\lambda)}\,|\psi_0(x)|^2\,d^n x$$
QFT: $$E[f] = (\Psi_0, e^{i\varphi(f)}\,\Psi_0) = \int_{\mathcal{S}'} e^{i(x,f)}\,d\mu(x)$$

In quantum mechanics $E$ is the 'characteristic function' (the Fourier transform) of a probability measure $|\psi_0(x)|^2\,d^n x$ on the vector space $\{x\} = \mathbb{R}^n$ dual to the space of the $\lambda$. In field theory $E$ is the characteristic functional of a probability measure $\mu$ on the vector space $\mathcal{S}'$ of distributions dual to the space $\mathcal{S}$ (cf., e.g., Gelfand and Vilenkin, 1964; Hida, 1970).

A prominent example is furnished by the harmonic oscillator ground state and its field theoretical counterpart:

QM: $$H_{\mathrm{osc}} = \tfrac12 (p, p) + \tfrac12 (q, \omega^2 q) - E_0, \qquad H_{\mathrm{osc}}\,\psi_0 = 0$$
so that in this case
$$E(\lambda) = e^{-\frac14 (\lambda,\, \omega^{-1}\lambda)} = e^{-\frac12 \|(q,\lambda)\psi_0\|^2}$$

QFT: $$H_0 = \tfrac12 \int d^s x\, :\!\pi^2(x) + (\nabla\varphi(x))^2 + m^2\varphi^2(x)\!: \;=\; \tfrac12 \int d^s x\, :\!\pi^2 + \varphi\,\omega^2\varphi\!:$$
$$E[f] = e^{-\frac14 (f,\, \omega^{-1} f)} = e^{-\frac12 \|\varphi(f)\Psi_0\|^2}$$

In both cases we are dealing with multivariate Gaussian distributions of mean zero: their characteristic function(al)s are obtained by exponentiating their second moments. With

$$\omega = (-\Delta + m^2)^{1/2}, \qquad \text{so that} \qquad (f, \omega^{-1} f) = \int \overline{\tilde f(k)}\,\frac{1}{\sqrt{k^2 + m^2}}\,\tilde f(k)\,d^s k,$$

$E[f]$ is the generating functional of the Fock representation of a scalar relativistic free field of mass $m$:

$$\ddot\varphi(x) + \omega^2 \varphi(x) = 0.$$

Creation and annihilation operators are introduced through

QM: $$q_k = \frac{1}{\sqrt{2\omega_k}}\,(a_k^+ + a_k), \qquad p_k = i\sqrt{\frac{\omega_k}{2}}\,(a_k^+ - a_k), \qquad a_k \psi_0 = 0, \qquad [a_k, a_{k'}^+] = \delta_{kk'}$$

QFT: $$\varphi(x) = (2\pi)^{-s/2} \int \frac{d^s k}{\sqrt{2\omega(k)}}\,\bigl(a^+(k)\,e^{-ikx} + a(k)\,e^{ikx}\bigr)$$
$$\pi(x) = i\,(2\pi)^{-s/2} \int d^s k\,\sqrt{\frac{\omega(k)}{2}}\,\bigl(a^+(k)\,e^{-ikx} - a(k)\,e^{ikx}\bigr)$$
$$a(k)\Psi_0 = 0, \qquad \omega(k) = \sqrt{k^2 + m^2}, \qquad [a(k), a^+(k')] = \delta^{(s)}(k - k')$$

In both cases the double dots : : of Wick ordering signify ordering field operator products such that all annihilation operators $a$ stand to the right of the creation operators $a^+$.

In particular this procedure makes $:\!\varphi^n(x)\!:$ a well-defined local operator in the sense that

$$:\!\varphi^n\!:(f) = \int f(x)\,:\!\varphi^n(x)\!:\,d^s x$$

is densely defined or even self-adjoint for suitable $n$ and $s$. As a consequence

$$h_I(x) = g\,:\!\varphi^n(x)\!:, \qquad n > 2$$

has become the classical ansatz for the interaction energy density of a self-coupled scalar field. In the following we shall concentrate on models of this type: while self-interacting scalar fields may not be appropriate as a fundamental concept for particle physics, they provide the simplest model for the discussion of the basic mathematical problems inherent in any non-trivial field theoretical ansatz.

III. CUTOFFS: THE GUENIN-SEGAL STRATEGY

Cutoffs come to mind as a remedy of the basic problems: 'putting the theory into a finite box' to avoid the infinite volume divergence, setting finite upper limits for momentum space integrals to eliminate ultraviolet divergences; these techniques have been employed from the early days of relativistic quantum field theory.
From the theoretical point of view any such surgery amounts to a violation of basic symmetries and principles such as translation invariance or locality; from the practical point of view it was often a matter of luck or intuition to extract just those quantities which were not violently cutoff-dependent, or otherwise to find a (more or less) physical interpretation of the cutoff.

It is a major and characteristic achievement of constructive quantum field theory that one has learned to make the cutoffs reversible, by first introducing sufficiently many of them to be able to construct a well-defined model and then controlling the limits as the cutoffs are removed in such a way that a non-trivial relativistic quantum field theory emerges. Evidently on the practical side much was learned about which quantities do not depend catastrophically on cutoffs and hence are amenable to approximate computation.

The Guenin-Segal strategy [reviewed by Jaffe (1969)] presents the most transparent example of such reversible surgery. Its goal is to circumvent Haag's theorem, i.e. to deal with the infinite volume divergence of interaction Hamiltonians such as

$$H = H_0 + \int g\,:\!\varphi^n(x)\!:\,d^s x$$

and it is based on one cutoff and two observations.

The cutoff is rather obvious. Since the infinite integral over the interaction energy density causes the problem we reduce the latter to zero at large distances through a space-dependent coupling

$$g(x) = \begin{cases} g > 0, & |x| < l \\ 0, & |x| > l + \varepsilon \end{cases}$$

with a smooth transition between the regions of constant coupling strength $|x| < l$ and of no interaction $|x| > l + \varepsilon$. We denote the modified Hamiltonian by $H_l$.

The two observations that bring this cutoff under control exploit the locality of the interaction term and the continuity of vacuum expectation values.

(A) Locality

The equation of motion

$$\ddot\varphi(x) = i[H_l, \pi(x)]$$

is insensitive to the values of $g(y)$ for $y \neq x$ since $\pi(x)$ commutes with the energy density at such points.
Causal propagation of the field [a feature to be verified! (Jaffe, 1969, p. 126ff.)] then allows one to conclude that the time evolution of the field is insensitive to the cutoff in the causal dependence region of the constant coupling domain $|x| < l$, i.e. we have a cutoff-independent solution in the diamond

$$\{(x, t) : |x| + |t| < l\}$$

in which we can imbed any bounded space-time region by choosing a sufficiently large, yet finite, cutoff parameter $l$.

But this solution of the equations of motion is not all that is required. For the construction of physical states we next invoke the following.

(B) Continuity of the Vacuum Expectation Values

Looking for a physical vacuum which, formally, should be given to us as the lowest lying eigenstate of $H$, we run into the following problem. Consider the ground states $\Omega_l$ [it is by no means trivial to verify their existence (Glimm and Jaffe, 1970)] of the approximate Hamiltonians $H_l$: on the basis of Haag's theorem we cannot hope for $\Omega_l$ to have a non-trivial limit as the cutoff $l$ is taken to infinity. However there is a subtle distinction between convergence of the vectors $\Omega_l$ in Fock space and that of the expectation functionals

$$\omega_l(A) = (\Omega_l, A\,\Omega_l)$$

on the field algebra generated by the approximate vacua.

The following heuristic argument supports this distinction: as the cutoff parameter $l$ is increased, the ground state differs from the Fock vacuum (and all other Fock space vectors) over larger and larger regions until in the limit it becomes orthogonal to all of them ['van Hove's phenomenon', Guerra (1972)]

$$\text{w-}\!\lim_{l \to \infty} \Omega_l = 0.$$

On the other hand it is plausible that the expectation value of local observables $A$ changes only little if the state in question is altered at distances of the order of a large $l$, and less and less as $l$ approaches infinity:

$$\lim_{l \to \infty} \omega_l(A) = \omega(A)$$

is expected to exist.
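The difference between convergence of vectors and convergence of expectation values can be seen in an exactly solvable toy case. The setting below is an assumption made for illustration and is not the model of the text: a free scalar field in a periodic box of length $L$ in one space dimension whose mass is changed from $m$ to $m'$ (a quadratic rather than an $n > 2$ interaction), with an arbitrary momentum cutoff `k_max`. Each mode is a harmonic oscillator, so the overlap of the two vacua is a product of Gaussian overlaps, while the equal-time two-point function in the new vacuum is a simple mode sum.

```python
import math


def mode_overlap(omega_a, omega_b):
    """Overlap of the Gaussian ground states of oscillators with frequencies omega_a, omega_b."""
    return math.sqrt(2.0 * math.sqrt(omega_a * omega_b) / (omega_a + omega_b))


def box_momenta(box_length, k_max):
    """Allowed momenta 2*pi*n/L of a periodic box, restricted to |k| <= k_max."""
    n_max = int(k_max * box_length / (2.0 * math.pi))
    return [2.0 * math.pi * n / box_length for n in range(-n_max, n_max + 1)]


def log_vacuum_overlap(box_length, mass_a, mass_b, k_max=50.0):
    """Logarithm of the overlap between the vacua of free fields of mass_a and mass_b."""
    total = 0.0
    for k in box_momenta(box_length, k_max):
        omega_a = math.sqrt(k * k + mass_a * mass_a)
        omega_b = math.sqrt(k * k + mass_b * mass_b)
        total += math.log(mode_overlap(omega_a, omega_b))
    return total


def equal_time_two_point(box_length, mass, separation, k_max=50.0):
    """Vacuum expectation value of phi(x) phi(y) at equal times, |x - y| = separation."""
    total = 0.0
    for k in box_momenta(box_length, k_max):
        total += math.cos(k * separation) / (2.0 * math.sqrt(k * k + mass * mass))
    return total / box_length


if __name__ == "__main__":
    for box_length in (1.0, 10.0, 100.0, 1000.0):
        overlap = math.exp(log_vacuum_overlap(box_length, 1.0, 2.0))
        two_point = equal_time_two_point(box_length, 2.0, 1.0)
        print(box_length, overlap, two_point)
```

In this sketch the overlap with the original vacuum decays exponentially with the box length, while the two-point function at fixed separation settles down to a limit. This mirrors, in a trivial example, the behaviour that the heuristic argument above expects for the ground states of the cutoff Hamiltonians.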
There is then a well-known procedure ('GNS construction'*) to cast $\omega(A)$ in a Hilbert space form:

$$\omega(A) = (\Omega, A\,\Omega)$$

At first sight this may be confusing. Have we found a vector $\Omega$ where there was none before? This is not the case. Recall that we have found it impossible to construct the limiting vector $\Omega$ in Fock space. Here it occurs as a cyclic vector of a field which is inequivalent to that of the Fock representation.

*For this 'reconstruction' of fields or bounded observables cf., for example, Jost (1965), Streater and Wightman (1964) and Emch (1972).

One might say that by controlling the limiting state we have succeeded in constructing the theory. What then remains to be done is to verify the required properties, such as Poincaré invariance [for Lorentz covariance in Fock space cf. Cannon and Jaffe (1970) and for a 'Euclidean' proof Simon (1974)], and the desirable ones, like the existence of particles (Glimm and coworkers, 1974) and of non-trivial scattering processes between them (Osterwalder and Seneor, 1975; Eckmann and Dimock, in press). We should emphasize that this construction actually predicts the particles of the theory: the resulting representation of the Poincaré group will not be equivalent to the original one in Fock space. In this sense, too, a relativistic quantum theory provides a more fundamental description of matter.

The program that we have sketched for the construction of such theories includes many steps which we have barely mentioned here although they are technically very involved. It is a tremendously important step forward in the construction of a relativistic quantum theory of matter, though, that this program has been proven viable, if only for a sufficiently simplified class of models.

It will turn out to be very instructive for us to track down the cause of such restrictions. Recall that we had cut off the interaction Hamiltonian in an effort to obtain a finite vector when applying it to the Fock vacuum:

$$\|H_l \Psi_0\| < \infty$$

If we express, for example, an interaction energy density

$$h_I(x) = g(x)\,:\!\varphi^n(x)\!:$$

in terms of the creation and annihilation operators given in our 'dictionary' it is straightforward to calculate

$$\|H_l \Psi_0\|^2 = \text{const.} \int \prod_{\nu=1}^{n} \frac{d^s k_\nu}{\omega(k_\nu)}\;\Bigl|\tilde g\Bigl(\sum_{\nu=1}^{n} k_\nu\Bigr)\Bigr|^2$$

Here $\tilde g$ denotes the Fourier transform of the cutoff function $g$. Whatever its exact form may be, the integral is finite only in a model world where the $k$-integration, and hence space, is one-dimensional. With increasing space-time dimensionality (and increasing interaction power $n$) the integral exhibits a higher and higher degree of divergence for large $k$, i.e. an 'ultraviolet' divergence that calls for renormalizations. We have found that in such cases the space cutoff Hamiltonian may not see the Fock vacuum; technically the latter is not in the domain of $H_l$. Nor, as one can check, are any other simple Fock space vectors that one might think of (Glimm, 1969). For such singular perturbations the domains of the Hamiltonians (the vectors of finite energy) are sensitive to the detailed features of the interaction; its specific form would have to be taken into account in their construction.

This particular problem can be attacked with the help of approximate Møller operators. In non-relativistic quantum mechanics these serve to intertwine between the free and the interacting Schrödinger Hamiltonians and, consequently, between their domains. A viable adaptation of these ideas to the case at hand proceeds along the following steps: introduce a high momentum cutoff in the interaction Hamiltonian to make it well-defined; use Friedrichs' perturbative construction [for a review and references cf. Streit (1970)] to obtain approximate wave operators ('dressing transformations'); apply these 'dressing transformations' to suitable Fock space vectors to obtain state vectors that the interacting Hamiltonian can see; remove cutoffs to obtain states appropriate for the full, no-cutoff interaction.
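The dimension dependence of the norm integral for $\|H_l \Psi_0\|^2$ quoted above can be checked by elementary power counting. The sketch below is a heuristic illustration, not a proof, and its function names are hypothetical. The rapidly decreasing $\tilde g$ effectively fixes the total momentum, leaving $n - 1$ free $s$-dimensional momenta against $n$ factors of $1/\omega$. For the simplest case $n = 2$ at total momentum zero the remaining integral is known in closed form in one and two space dimensions.

```python
import math


def superficial_degree(n, s):
    """Power-counting degree of  Int prod_{v=1..n} d^s k_v / omega(k_v) |g~(sum k_v)|^2.

    (n - 1) free s-dimensional momenta contribute (n - 1) * s powers of the
    cutoff, and the n factors 1/omega(k) each remove one power.
    A negative value indicates convergence by this counting.
    """
    return (n - 1) * s - n


def inner_integral_1d(cutoff, mass):
    """Int_{-cutoff}^{cutoff} dk / (k^2 + m^2): the case n = 2, s = 1, total momentum zero."""
    return 2.0 * math.atan(cutoff / mass) / mass


def inner_integral_2d(cutoff, mass):
    """Int_{|k| < cutoff} d^2k / (k^2 + m^2) = pi * log(1 + cutoff^2 / m^2): n = 2, s = 2."""
    return math.pi * math.log(1.0 + (cutoff / mass) ** 2)


if __name__ == "__main__":
    for s in (1, 2, 3):
        print("s =", s, [superficial_degree(n, s) for n in (2, 3, 4)])
    for cutoff in (1e1, 1e2, 1e4):
        print(cutoff, inner_integral_1d(cutoff, 1.0), inner_integral_2d(cutoff, 1.0))
```

In one space dimension the degree is $-1$ for every $n$ and the closed-form integral tends to $\pi/m$. In two dimensions the same integral grows logarithmically with the cutoff, and the degree increases with both $n$ and $s$.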
Technically the construction of such approximate dressing transformations and the control of the limits are extremely complicated, but two structurally interesting observations should be made before we embark on a more recent alternate approach.

One can only hope to find intertwining transformations for operators with matching spectra. Friedrichs' construction actually generates these adjustments of the ground-state energy, mass gap, etc. These are the so-called renormalization counter-terms. In the limit as the cutoff is removed they would become infinite, but as they cancel corresponding infinite ground-state energies, masses, etc., in the original Hamiltonian, the overall renormalized energy operator has a finite limit. A particularly accessible subclass of models is formed by those where such asymptotically infinite renormalization terms occur only up to a finite perturbation theoretical order. These are the so-called superrenormalizable models, among them the 'Yukawa interaction' of fermions and bosons in two space-time dimensions ($Y_2$), and the quartic self-interactions of scalar mesons in a three-dimensional space-time ($\varphi^4_3$). At present the problem of going beyond this class and of tackling models like $\varphi^4_4$ in the physical four-dimensional space-time is still unsolved.

Secondly, two cases must be distinguished regarding the limiting 'dressed states' as the momentum cutoff is removed. In the less singular case the limits can be performed within Fock space. Otherwise one must proceed as with the infinite volume limit that we have discussed above and construct a new field representation from limits of expectation values. In this latter case then, the ultraviolet divergences alone already call for a non-Fock representation of the field. Prominent examples are the $Y_2$ and $\varphi^4_3$ models, respectively [for a review cf. Hepp (1969)].

IV. EUCLIDEAN SPACE-TIME, AND BACK!
It has been observed frequently in various contexts of non-relativistic as well as relativistic quantum dynamics that the transition to imaginary time results in remarkable structural simplifications: one obtains the heat equation from the Schrödinger equation, correspondingly Wiener integrals instead of Feynman's path integral, and better behaved kernels in the Bethe-Salpeter equation for relativistic two-particle amplitudes. Most importantly for us here, the transition from relativistic to 'Euclidean' quantum field theory, brought about by switching from relativistic Minkowski space-time to a space-(imaginary) time with positive definite Euclidean metric, gives us models of equilibrium statistical mechanics (Symanzik, 1969) which we are comparatively much better equipped to handle.

The central role that this latter transformation has recently begun to play in the development of relativistic quantum dynamics stems from the fact that Nelson (Simon, 1974, Chap. IV) and K. Osterwalder and R. Schrader (Simon, 1974, Chap. II) have given conditions under which it becomes reversible. In the light of this discovery it has become an advantageous, and very effective, approach to the construction of quantum field theories to first establish the corresponding Euclidean theories and as many of their properties as possible by means of methods borrowed from statistical mechanics, and finally to check that they survive the transition back to the relativistic Minkowski space-time. As a recent example, among many others, we mention work (cf. the papers of Eckmann and Dimock, in press) on the existence of a non-trivial scattering matrix and its asymptotic series expansion.

For in-depth reading on the 'Euclidean strategy' a monograph written by one of the leading experts in this field is available (Simon, 1974). In the present review we want to give an introductory sketch of the method and of its scope.
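The first of these simplifications, the passage from the Schrödinger equation to the heat equation, can be checked numerically. The following Python sketch is an illustration only; the units $\hbar = m = 1$, the plane-wave test function and all function names are assumptions made for this example and are not part of the constructions reviewed here. It continues a free-particle plane wave to imaginary time $t = -i\tau$ and verifies by finite differences that the result satisfies $\partial_\tau u = \tfrac12\,\partial_x^2 u$.

```python
import cmath


def schrodinger_wave(x, t, k):
    """Plane wave solving i dpsi/dt = -(1/2) d2psi/dx2 (assumed units hbar = m = 1)."""
    return cmath.exp(1j * k * x - 1j * k * k * t / 2)


def euclidean_wave(x, tau, k):
    """The same wave continued to imaginary time t = -i * tau."""
    return schrodinger_wave(x, -1j * tau, k)


def heat_residual(x, tau, k, h=1e-4):
    """Finite-difference residual of du/dtau - (1/2) d2u/dx2 for the continued wave."""
    u = euclidean_wave
    d_tau = (u(x, tau + h, k) - u(x, tau - h, k)) / (2 * h)
    d_xx = (u(x + h, tau, k) - 2 * u(x, tau, k) + u(x - h, tau, k)) / (h * h)
    return d_tau - 0.5 * d_xx


# The real-time wave keeps unit modulus; the continued wave decays in tau.
print(abs(schrodinger_wave(0.0, 2.0, 1.0)))
print(abs(euclidean_wave(0.0, 2.0, 1.0)))
# The residual of the heat equation vanishes up to discretization error.
print(abs(heat_residual(0.3, 0.7, 1.5)))
```

The oscillating phase of the real-time solution is replaced by a damping factor, which is the elementary reason why the Euclidean objects are better behaved.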
To this end we need to introduce one more concept from quantum field theory, the time-ordered Green's functions. They are symmetric functions of $n$ space-time arguments defined to equal the vacuum expectation value of the $n$-fold time-ordered product of the field at space-time points $x_i = (\mathbf{x}_i, x_{0i})$:

$$\tau_n(x_1,\ldots,x_n) = (\Omega,\,\varphi(x_1)\cdots\varphi(x_n)\,\Omega)\quad\text{if } x_{01} > x_{02} > x_{03} > \cdots > x_{0n},$$

and they are described most handily by their generating functional

$$\tau[f] = \sum_n \frac{i^n}{n!}\int \prod_{\nu=1}^{n} d^{s+1}x_\nu\, f(x_\nu)\;\tau_n(x_1,\ldots,x_n) = \bigl(\Omega,\;T\,e^{\,i\int \varphi(x) f(x)\,d^{s+1}x}\,\Omega\bigr).$$

It is straightforward but very instructive to calculate this functional for the trivial case where $\varphi$ is a free field of mass $m > 0$ in Fock space, so that it obeys the Klein-Gordon equation of motion

$$(\partial_\mu\partial^\mu + m^2)\,\varphi(x) = 0 .$$

Defining its Green's function by

$$\Delta_F(x) = (2\pi)^{-(s+1)}\int \frac{e^{-ipx}}{p^2 - m^2 + i\varepsilon}\, d^{s+1}p, \qquad p^2 = p_\mu p^\mu,$$

one finds for the free-field functional

$$\tau[f] = \tau_0[f] = \exp\Bigl(-\frac{i}{2}\int dx\,dy\, f(x)\,\Delta_F(x-y)\,f(y)\Bigr).$$

Continuation to imaginary times yields the functional $\sigma_0[f]$. Minkowski inner products become Euclidean ones,

$$p^2 = p_\mu p^\mu \;\to\; -p^2,$$

so that, in terms of the Fourier transform $\hat f$,

$$\sigma_0[f] = \exp\Bigl(-\tfrac12\,\bigl(\hat f,\,(p^2+m^2)^{-1}\hat f\bigr)\Bigr).$$

$\sigma$ is the generating functional of the $\tau$-functions continued to imaginary time, the so-called Schwinger functions $S_n$. Their interest lies in the fact that fairly explicit and very useful expressions for $\sigma$ and the Schwinger functions can also be derived for an interacting field. It will be our main task in this section to do so in a heuristic fashion. The necessary mathematical arguments are presented in Simon (1974), Chapter V; as examples of recent extensions to more singular models such as the Yukawa model $Y_2$ or the quartic meson self-interaction $\varphi^4_3$ in three-dimensional space-time we quote McBryan (in press).

The crucial observation is that, in contradistinction to $\tau_0[f]$, $\sigma_0[f]$ is the exponential of a negative definite quadratic form, i.e.
just like the generating functional of a free field at fixed time we may write it as the Fourier transform of a (Gaussian) probability measure on the space of generalized functions:

$$\sigma_0[f] = \int_{\mathcal{S}'(\mathbb{R}^{s+1})} e^{\,i\langle x, f\rangle}\, d\mu_0(x).$$

Recall that (for finite dimensional vector spaces!) the Fourier transform of a Gaussian is again a Gaussian, with the inverse quadratic form in the exponent, i.e. formally

$$d\mu_0(x) = \text{const}\; e^{-\frac12\,(\hat x,\,(p^2+m^2)\,\hat x)}\, d^\infty x = \text{const}\; e^{-\frac12\int d^{s+1}x\,\bigl(\dot x^2 + (\nabla x)^2 + m^2 x^2\bigr)}\, d^\infty x .$$

Observe the space-time integral of the Hamiltonian density in the exponent: an extra factor

$$\mathcal{N}_g\, e^{-\int dx\, g\, h_I(x)}$$

ought to generate the measure $d\mu_g$ appropriate to the interaction mediated by the Hamiltonian

$$H = H_0 + \int d^s x\; g\, h_I(x).$$

Now evidently the infinite volume element $d^\infty x$ above is purely formal, but if we add the interaction factor to the left-hand side there is at least a fighting chance for $d\mu_g$ to be well-defined since $d\mu_0$ is. $\mathcal{N}_g$ is just a normalization factor for the new measure:

$$\mathcal{N}_g^{-1} = \int e^{-\int dx\, g\, h_I(x)}\, d\mu_0 .$$

Not unexpectedly there is a Euclidean variant of Haag's theorem in our way, but by now we know how to deal with this: we cut off the interaction strength by making $g$ space-time dependent and let $g \to \text{const}$ in the final expression. For our favourite model $\varphi^4_2$ this leads to the probability measure

$$d\mu[x] = \lim_{g\to\text{const}} \mathcal{N}_g\; e^{-\int d^2x\, g(x)\,:x^4(x):}\; d\mu_0[x]$$

with the Schwinger functions as its moments,

$$S_n(x_1,\ldots,x_n) = \int x(x_1)\cdots x(x_n)\, d\mu[x].$$

For these quantities then one must verify the 'Osterwalder-Schrader axioms' (Simon, 1974, Chap. II), which are Euclidean analogues to those of Wightman and which guarantee that a corresponding relativistic quantum field theory obeying the latter exists.

The shorthand notation

$$\int \cdot\; d\mu_0 = \langle\,\cdot\,\rangle_0, \qquad \int_V d^2x\, h_I(x) = U_V$$

makes the similarity with infinite volume Gibbs states of equilibrium statistical mechanics even more transparent:

$$\sigma[f] = \lim_{V\to\infty} \frac{\bigl\langle e^{\,i\langle x, f\rangle}\, e^{-gU_V}\bigr\rangle_0}{\bigl\langle e^{-gU_V}\bigr\rangle_0}\,.$$
The following random collection of observations is meant to serve as an illustration (by no means an exhaustive one) of the wealth of information which this analogy opens up.

(1). The coupling constant $g$ plays the role of an inverse temperature. High temperature expansions as in statistical mechanics have turned out to be useful to deal with the weak coupling regime of model quantum field theories (Simon, 1974).

(2). Physical masses, i.e. the lowest excitations of the system, can be discussed effectively in terms of inverse correlation lengths.

(3). The direct coupling of the random field at different space-time points is brought about by the gradient term of the free Hamiltonian. In a lattice approximation, where the random field is replaced by a discrete set of 'spin' variables $X_i$, this coupling amounts to that of a nearest neighbour Ising ferromagnet. As a result various useful correlation inequalities can be proven for the Schwinger functions (Simon, 1974).

(4). As the coupling strength is increased $\varphi^4_2$ models exhibit phase transitions, long-range order, and the breaking of the $\varphi \to -\varphi$ symmetry (Glimm and coworkers, 1975; cf. also Glimm and Jaffe, in press). The proof uses mean field techniques and the Peierls argument from statistical mechanics. Here it becomes patent to what extent the Euclidean formulation has come into its own.

(5). The existence of a non-trivial $\varphi^4_4$ model has turned out to be closely related to the non-triviality of the four-dimensional Ising model at the critical point (Glimm and Jaffe, 1974; Schrader, 1975).

With this glimpse of the final goal, namely to establish non-trivial relativistic quantum theories for interacting particles in four-dimensional space-time, we close this 'introductory review'. We have tried to display a representative subset of the techniques and the trends of a field that has recently seen rapid development.
At this point there is good reason to be optimistic about the emergence of relevant models for the subnuclear structure and interaction of matter. With this goal in mind the impressive amount not just of abstract existence proofs, but beyond these of structural insight and of sound computational techniques inherent in the recent development of constructive quantum field theory acquires a particular relevance.

REFERENCES

Araki, H. (1969) in Local Quantum Theory, R. Jost (Ed.), Academic Press, New York.
Bjorken, J. D. and Drell, S. D. (1965) Relativistic Quantum Fields, McGraw-Hill, New York.
Cannon, J. T. and Jaffe, A. (1970) Comm. Math. Phys., 17, 261.
Dashen, R., Hasslacher, B. and Neveu, A. (1974) Phys. Rev., D10, 4138.
Eckmann, J. P. (in press) in Quantum Dynamics: Models and Mathematics, L. Streit (Ed.), Springer, Vienna.
Dimock, J. (in press) in Quantum Dynamics: Models and Mathematics, L. Streit (Ed.), Springer, Vienna.
Driessler, W. (in press) 'On the structure of fields and algebras on null-planes I, II', Acta Phys. Austriaca.
Emch, G. (1972) Algebraic Methods in Statistical Mechanics and Quantum Field Theory, John Wiley, New York.
Gelfand, I. M. and Vilenkin, N. Ya. (1964) Generalized Functions, Vol. 4, Chap. IV, Academic Press, New York.
Glimm, J. and Jaffe, A. (1970) Ann. Math., 91, 362.
Glimm, J. and Jaffe, A. (1974) Phys. Rev. Lett., 33, 440.
Glimm, J. and Jaffe, A. (in press) in Quantum Dynamics: Models and Mathematics, L. Streit (Ed.), Springer, Vienna.
Glimm, J., Jaffe, A. and Spencer, T. (1974) Ann. Math., 100, 583.
Glimm, J., Jaffe, A. and Spencer, T. (1975) Comm. Math. Phys., 45, 203.
Guerra, F. (1972) Phys. Rev. Lett., 28, 1213.
Haag, R. (1955) Dan. Mat. Fys. Medd., 29, no. 12.
Heisenberg, W. (1939) Z. Physik, 113, 61.
Heisenberg, W. (1960) Sprache und Wirklichkeit in der modernen Physik in Gestalt und Gedanke, Folge 6.
Heisenberg, W. (1969) Der Teil und das Ganze, Chaps. 5-11, 17, Piper, Munich.
Hepp, K. (1969) Théorie de la Renormalisation, Springer, Berlin.
Hida, T. (1970) Stationary Stochastic Processes, Section 4, Princeton University Press, Princeton.
Jaffe, A. (1969) in Local Quantum Theory, R. Jost (Ed.), Academic Press, New York.
Jost, R. (1965) The General Theory of Quantized Fields, Amer. Math. Soc., Providence.
Leutwyler, H., Klauder, J. R. and Streit, L. (1970) Nuovo Cimento, 66A, 536.
Mackey, G. W. (1963) Mathematical Foundations of Quantum Mechanics, Benjamin, New York.
McBryan, O. A. (in press) in Quantum Dynamics: Models and Mathematics, L. Streit (Ed.), Springer, Vienna.
Osterwalder, K. and Sénéor, R. (1975) 'The scattering matrix is non-trivial for weakly coupled $P(\varphi)_2$ models'. Preprint.
Putnam, C. R. (1967) Commutation Properties of Hilbert Space Operators and Related Topics, Springer, Berlin.
Reed, M. (1976) Abstract Non-linear Wave Equations, Springer, Berlin.
Schrader, R. (1975) 'A possible constructive approach to $\varphi^4_4$ I, II'. Berlin preprints.
Schweber, S. S. (1961) An Introduction to Relativistic Quantum Field Theory, Harper and Row, Evanston.
Simon, B. (1974) The $P(\varphi)_2$ Euclidean (Quantum) Field Theory, Princeton University Press, Princeton.
Streater, R. F. and Wightman, A. S. (1964) PCT, Spin and Statistics, and All That, Benjamin, New York.
Streit, L. (1970) Acta Phys. Austriaca Suppl. VII, 355.
Symanzik, K. (1969) in Local Quantum Theory, R. Jost (Ed.), Academic Press, New York.

Classical Electromagnetic and Gravitational Field Theories as Limits of Massive Quantum Theories

GORDON FELDMAN
The Johns Hopkins University, Baltimore, Maryland, U.S.A.

1. INTRODUCTION

The correspondence principle in quantum mechanics states, inter alia, that as Planck's constant $\hbar$ approaches zero the theory must approach the corresponding classical theory. This principle is meaningful if there exists a classical theory which corresponds to the particular quantum theory.
If we examine quantum field theories we can apply the principle only to 'massless' theories, i.e. to field theories which on quantization describe particles of zero mass. One can see this in many ways. The simplest is to notice that a field equation involves derivatives of the field to which one must add a 'mass term'. Dimensional arguments require that this term be proportional to powers of $mc/\hbar$ (the inverse Compton wavelength, $m$ being the mass). We see that taking the limit $\hbar \to 0$ with $m$ kept fixed is completely different from the limit $m \to 0$ and then $\hbar \to 0$. Accordingly, a classical field theory of particles requires taking the $m \to 0$ limit before the $\hbar \to 0$ limit, i.e. a classical field theory of particles, of necessity, describes massless particles. In fact as $m \to 0$ the parameter $\hbar$ disappears from the field equations.

Two familiar examples of classical field theories are the electromagnetic (Maxwell theory) and gravitational (Einstein theory) field theories. We can interpret the Maxwell field (or photon field) as a relativistic field describing particles of mass zero and spin one. The Einstein (or gravitational) field is a relativistic field describing particles of zero mass and spin two. One might expect that these classical field theories may be the limit of quantum field theories which describe massive particles of spin one and two. This problem has attracted some attention recently (Boulware and Deser, 1972; van Dam and Veltman, 1970). One examines quantum field theories describing massive particles of spin one and two coupled to sources and then performs the $m \to 0$ limit. The question is to discover whether this limit is smooth, i.e. does this limiting theory give rise to the same experimental consequences as the corresponding field theories describing massless particles of spin one and two coupled to the same sources, respectively.
That there may be some problems connected with the $m \to 0$ limit we can see from the properties of the representations of the Poincaré group. Those irreducible representations which span the space of massive particle states also have a spin parameter, $s$, with degeneracy $2s + 1$, i.e. a particle of mass $m$ and spin $s$ has $2s + 1$ degrees of freedom. However, the irreducible representations corresponding to a massless particle also have a spin parameter $s$ but only two degrees of freedom*. The implication of the above remarks is that the Hilbert space of physical states describing particles of mass $m \neq 0$ and $s \geq 1$ is somehow larger than the Hilbert space for the corresponding massless particles. Since degrees of freedom cannot disappear, the resolution to the problem must be in the fact that either the $m \to 0$ limits are not smooth (i.e. the two theories are different) or that the 'disappearing' degrees of freedom decouple, or both.

In this article we examine carefully the $m \to 0$ limits for spin one and spin two field theories to see what happens to the structure of the theories. Most of the results have been obtained previously by Boulware and Deser (1972) and by van Dam and Veltman (1970). What we do in this article is to approach the problem by using different techniques. In Section 2 we examine the equations of motion for spin one and two fields in order to see how the $m \neq 0$ and $m = 0$ equations each describe the correct number of degrees of freedom. In Section 3 we solve the equations in the presence of sources by finding the propagators. We again compare and contrast the solutions for the massive and massless cases in order to see if and why the massive solution approaches the massless case. In Section 4, we find those Lagrangians which lead to the required equations of motion. We also make use of the Lagrangian to find the commutation relations for the independent degrees of freedom, in order to see again if the 'disappearing' degrees of freedom do or do not have smooth limits.
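The counting behind these remarks can be tabulated in a few lines. The following Python sketch is merely a bookkeeping aid; the function names are our own, and the massless count assumes, as in the text, that parity is included so that both helicities $\pm s$ belong to one particle.

```python
def massive_degrees_of_freedom(spin):
    """A particle of mass m != 0 and spin s has 2s + 1 degrees of freedom."""
    return 2 * spin + 1


def massless_degrees_of_freedom(spin):
    """With parity included a massless particle has two helicity states,
    except for spin zero, where there is only one degree of freedom."""
    return 1 if spin == 0 else 2


def disappearing_degrees_of_freedom(spin):
    """Degrees of freedom that must decouple, or spoil the limit, as m -> 0."""
    return massive_degrees_of_freedom(spin) - massless_degrees_of_freedom(spin)


for spin in (0, 1, 2):
    print(spin,
          massive_degrees_of_freedom(spin),
          massless_degrees_of_freedom(spin),
          disappearing_degrees_of_freedom(spin))
```

For spin one a single degree of freedom has to be accounted for in the limit, for spin two there are three.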
In the Appendices we outline some of the projection operator techniques used in the paper.

2. THE EQUATIONS OF MOTION

In this section we discuss the equations of motion of massive and massless fields of spins one and two in the presence of external sources. We will demand ultimately that these equations be derivable from a Lagrangian. Therefore if the field for spin one is a vector field $A_\mu$, its source, $j_\mu$, is also a vector, and if the field for spin two is a symmetric tensor field $h_{\mu\nu}$, its source, $T_{\mu\nu}$, is also a symmetric tensor. We discuss two problems in this section: (a) how the equations of motion for a vector field with four components describe only three dynamical variables for mass $m \neq 0$ and two dynamical variables for $m = 0$, and (b) how the equations of motion for a symmetric tensor field with ten components describe only five dynamical variables for $m \neq 0$ and two dynamical variables for $m = 0$.

*In addition to the operations of the Poincaré group we include the spatial inversion (or parity) operation. Of course for spin $s = 0$ there is only one degree of freedom.

We can write down the equations for both spins uniformly by making use of the Levi-Civita tensor density $\epsilon_{\mu\alpha\rho\lambda}$ with the usual antisymmetry properties*. Define the second order differential operator

$$D_{\mu\alpha}{}^{\nu\beta} \equiv \epsilon_{\mu\alpha\rho\lambda}\,\epsilon^{\nu\beta\sigma\lambda}\,\partial^\rho\partial_\sigma \qquad (1)$$

where

$$\partial_\mu = \frac{\partial}{\partial x^\mu}. \qquad (2)$$

We can write the equations for spin one and two by operating with $D$ on either $A_\mu$ or $h_{\mu\nu}$ and saturating enough indices so that the resulting tensor transforms like $A_\mu$ or $h_{\mu\nu}$ respectively. Thus for spin one we form

$$\frac{1}{2!}\,D_{\alpha\mu}{}^{\beta\mu}A_\beta(x) \qquad (3)$$

and for spin two

$$D_{\alpha\mu}{}^{\beta\nu}\,h^{\mu}{}_{\nu}(x). \qquad (4)$$

The mass term will be proportional to $A_\mu$ and $h_{\mu\nu}$ respectively. Accordingly for spin one we have

$$\frac{1}{2!}\,D_{\alpha\mu}{}^{\beta\mu}A_\beta(x) - m^2 A_\alpha(x) = j_\alpha(x) \qquad (5)$$

and for spin two†

$$D_{\alpha\mu}{}^{\beta\nu}\,h^{\mu}{}_{\nu}(x) + m^2\bigl(h_\alpha{}^\beta(x) - a\,\delta_\alpha^\beta\,h(x)\bigr) = T_\alpha{}^\beta(x), \qquad (6a)$$

or, for $a = 1$,

$$D_{\alpha\mu}{}^{\beta\nu}\,h^{\mu}{}_{\nu}(x) + m^2\bigl(h_\alpha{}^\beta(x) - \delta_\alpha^\beta\,h(x)\bigr) = T_\alpha{}^\beta(x), \qquad (6)$$

where

$$h = h^\mu{}_\mu \qquad (7)$$

and $a$, at present, is arbitrary.

*We shall use units such that $\hbar = c = 1$.
The metric $\eta_{\mu\nu}$ has only the diagonal elements $(1, -1, -1, -1)$; $\mu, \nu = 0, 1, 2, 3$; $i, j = 1, 2, 3$; and $\epsilon_{0123} = +1$.

†Note that we can write the equation for a spin zero field $\phi(x)$ as

$$\frac{1}{3!}\,D_{\mu\alpha}{}^{\mu\alpha}\phi - m^2\phi = j$$

since

$$\frac{1}{3!}\,\epsilon_{\mu\alpha\rho\lambda}\,\epsilon^{\mu\alpha\sigma\lambda} = -\delta_\rho^\sigma .$$

If we make use of the following identities

$$\epsilon_{\mu\alpha\rho\lambda}\,\epsilon^{\nu\beta\sigma\lambda} = -\bigl(\delta_\mu^\nu\delta_\alpha^\beta\delta_\rho^\sigma + \delta_\mu^\beta\delta_\alpha^\sigma\delta_\rho^\nu + \delta_\mu^\sigma\delta_\alpha^\nu\delta_\rho^\beta - \delta_\mu^\nu\delta_\alpha^\sigma\delta_\rho^\beta - \delta_\mu^\sigma\delta_\alpha^\beta\delta_\rho^\nu - \delta_\mu^\beta\delta_\alpha^\nu\delta_\rho^\sigma\bigr) \qquad (8)$$

and

$$\frac{1}{2!}\,\epsilon_{\mu\alpha\rho\lambda}\,\epsilon^{\nu\beta\rho\lambda} = -\bigl(\delta_\mu^\nu\delta_\alpha^\beta - \delta_\mu^\beta\delta_\alpha^\nu\bigr) \qquad (9)$$

we see that equation (5) for $m = 0$ is just Maxwell's equation and equation (6) is the Pauli-Fierz equation (Fierz and Pauli, 1939) for massive spin two.

Now we must see how many dynamical variables appear in equations (5), (6) and (6a). Any component of $A_\mu$ or $h_{\mu\nu}$, say $\psi$, and its time derivative $\partial_0\psi$ can be assigned arbitrarily on some constant time surface, say $t = 0$. The dynamical variables will be those components of $A_\mu$ and $h_{\mu\nu}$ which appear in the equations with second time derivatives. If in equation (5) we set the index $\beta = 0$, using (1) we find $\sigma \neq 0$ (one of the two derivatives in (1) is then spatial), and thus $A_0$ appears in the equations only with a zeroth or first time derivative. We have then that only the $A_i$ are dynamical variables and that the equations of motion determine $A_0$ in terms of the $A_i$.

We apparently have not used the fact that $m \neq 0$. However if $m = 0$, equation (5) is invariant under a set of transformations, the gauge transformations*

$$A_\mu \to A_\mu + \partial_\mu\Lambda . \qquad (10)$$

One may see this trivially by using (1) and the antisymmetry property of $\epsilon$. We can choose $\Lambda$ (i.e. find a gauge) so that one of the apparent dynamical variables $A_i$ is identically zero. We are left with two dynamical variables for $m = 0$.

Another property that follows immediately from (5) for $m = 0$ is current conservation. That is, if we take $\partial^\alpha$ of equation (5) we must have, for $m = 0$,

$$\partial^\alpha j_\alpha = 0 . \qquad (11)$$

This is a consequence of the equations of motion; it is not a separate equation of motion. Again, it is proved trivially using the properties of $\epsilon_{\mu\alpha\rho\lambda}$.
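Both statements, gauge invariance and current conservation for $m = 0$, are easily verified in momentum space, where with the identity (9) the operator of equation (5) acts on $A_\beta$ as $(p^2 - m^2)\delta_\alpha^\beta - p_\alpha p^\beta$. The following Python sketch is an illustration under that assumption; the numerical momentum and field components, as well as all function names, are hypothetical choices made for the example.

```python
ETA = (1, -1, -1, -1)  # diagonal metric, signature (+, -, -, -)


def lower(v):
    """Lower the index of a contravariant four-vector."""
    return [ETA[m] * v[m] for m in range(4)]


def dot(u, v):
    """Minkowski product u^mu v_mu of two contravariant four-vectors."""
    return sum(ETA[m] * u[m] * v[m] for m in range(4))


def field_operator(p, m2=0):
    """K_a^b = (p^2 - m^2) delta_a^b - p_a p^b for momentum p^mu, mass squared m2."""
    p_low, p2 = lower(p), dot(p, p)
    return [[(p2 - m2) * (a == b) - p_low[a] * p[b] for b in range(4)]
            for a in range(4)]


def source(p, a_low, m2=0):
    """j_a = K_a^b A_b for a field with covariant components A_b."""
    k = field_operator(p, m2)
    return [sum(k[a][b] * a_low[b] for b in range(4)) for a in range(4)]


def divergence(p, j_low):
    """p^a j_a, the momentum-space form of the divergence of j."""
    return sum(p[a] * j_low[a] for a in range(4))


p = [3, 1, 2, -1]        # hypothetical momentum with p^2 != 0
a_field = [5, -2, 7, 1]  # hypothetical covariant components A_b
pure_gauge = lower(p)    # A_b proportional to p_b, i.e. a gradient

print(source(p, pure_gauge))                    # m = 0: a pure gauge field needs no source
print(divergence(p, source(p, a_field)))        # m = 0: the source is conserved
print(divergence(p, source(p, a_field, m2=4)))  # m != 0: equals -m^2 p^b A_b
```

For $m \neq 0$ the divergence of the source is proportional to $m^2\,p^\beta A_\beta$, so conservation of the current is then an extra condition and no longer an identity.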
For the case $m \neq 0$, we may be able to choose sources such that (11) follows, but in this case (11) will be an additional equation of motion.

We can now carry out the same procedure for the field $h_{\mu\nu}$. In equation (6a), if we set the index $\alpha = 0$ and use (1), we again find that one of the two derivatives is spatial, and therefore the four $h_{0\mu}$ cannot be dynamical variables. We are left with the six $h_{ij}$ as possible dynamical variables. Consider, now, equation (6a) with $\alpha = \beta = 0$. It reads

$$F(\partial_i\partial_j h_{lk}) + m^2\bigl[(1-a)\,h^0{}_0 - a\,h^i{}_i\bigr] = T^0{}_0 . \qquad (12)$$

The derivative terms $F$ are only spatial derivatives of the $h_{ij}$. Accordingly, if we choose

$$a = 1 \qquad (13)$$

equation (12) is a constraint equation on the six $h_{ij}$ and we are left with five dynamical variables, the number required to describe a massive spin two field. The resulting equation, (6), is indeed the Pauli-Fierz equation (Fierz and Pauli, 1939).

*We assume of course that $j_\mu$ is gauge invariant.

Let us now turn to the case $m = 0$ for (6). Of course (12) with $m = 0$ again reduces the six $h_{ij}$ to five independent variables. Again, as for spin one, equation (6) for $m = 0$ is invariant under a set of transformations, the gauge transformations

$$h_{\mu\nu} \to h_{\mu\nu} + \partial_\mu\Lambda_\nu + \partial_\nu\Lambda_\mu \qquad (14)$$

which one deduces trivially from the properties of $D$. We can choose $\Lambda_\mu$ such that three of the remaining five $h_{ij}$ are identically zero. This leaves us with two dynamical variables for $m = 0$, as required. Again it follows from the equations of motion that

$$\partial^\alpha T_{\alpha\beta} = 0 \qquad (15)$$

if $m = 0$. This leads to the well-known problem of the consistency of these equations, if we identify $T_{\alpha\beta}$ with the energy momentum tensor of matter and radiation and $h_{\alpha\beta}$ with the gravitational field. This $T_{\alpha\beta}$ cannot be conserved and we must add to it the energy momentum of the gravitational field itself, which then leads to equations non-linear in the $h_{\mu\nu}$. We can use this technique to lead us to the full Einstein equations for the gravitational field. See Deser (1970) for references. In this work we shall restrict ourselves to the linearized version.
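In doing so we are in effect assuming that $T_{\alpha\beta}$ is proportional to some small coupling constant $f$ and that we work only to $O(f^2)$, in which case the matter and radiation energy momentum tensor will be conserved.

3. THE PROPAGATORS

In this section we shall obtain the propagators for the classical fields by using the projection operator techniques outlined in Appendix A. It is convenient to take the Fourier transform of the equations of motion (5) and (6) and so work in momentum space. After this transformation and making use of the identities (8) and (9) we can write the equations of motion for $A_\mu(p)$ and $h_{\mu\nu}(p)$, the Fourier transformed fields, as follows:

$$K_\alpha{}^\beta(p)\,A_\beta(p) = j_\alpha(p) \qquad (16)$$

and

$$K_{\alpha\beta}{}^{\mu\nu}(p)\,h_{\mu\nu}(p) = -T_{\alpha\beta}(p) \qquad (17)$$

where

$$K_\alpha{}^\beta \equiv -\frac{1}{2!}\,D_{\alpha\mu}{}^{\beta\mu}(p) - m^2\delta_\alpha^\beta = (p^2 - m^2)\,\delta_\alpha^\beta - p_\alpha p^\beta \qquad (18)$$

(here $D(p)$ denotes the operator (1) with $\partial_\mu$ replaced by $p_\mu$) and

$$K_{\alpha\beta}{}^{\mu\nu} = (p^2 - m^2)\bigl(\delta_\alpha^\mu\delta_\beta^\nu - \eta_{\alpha\beta}\eta^{\mu\nu}\bigr) - \bigl(\delta_\alpha^\mu p_\beta p^\nu + \delta_\beta^\nu p_\alpha p^\mu\bigr) + \eta_{\alpha\beta}\,p^\mu p^\nu + p_\alpha p_\beta\,\eta^{\mu\nu},$$

where we have assumed $h_{\mu\nu}$ and $T_{\mu\nu}$ are symmetric tensors. If the tensors $K$ have an inverse we can solve equations (16) and (17) for $A_\mu$ and $h_{\mu\nu}$ respectively, to read

$$A_\mu = G_\mu{}^\nu\, j_\nu \qquad (19)$$

and

$$h_{\mu\nu} = -G_{\mu\nu}{}^{\alpha\beta}\,T_{\alpha\beta} \qquad (20)$$

where in both cases

$$G = K^{-1}, \qquad (21)$$

i.e.

$$K_\mu{}^\alpha\,G_\alpha{}^\nu = \delta_\mu^\nu \qquad (22)$$

and

$$K_{\mu\nu}{}^{\rho\sigma}\,G_{\rho\sigma}{}^{\alpha\beta} = 1_{\mu\nu}{}^{\alpha\beta}, \qquad (23)$$

$$1_{\mu\nu}{}^{\alpha\beta} = \tfrac12\bigl(\delta_\mu^\alpha\delta_\nu^\beta + \delta_\mu^\beta\delta_\nu^\alpha\bigr). \qquad (24)$$

Since $A_\mu$ is a four-vector field we can find $G_\mu{}^\nu$ by writing $K_\mu{}^\nu$ in terms of its spin one and spin zero projection operators. These are easily found to be

$$P^{(1)}{}_\mu{}^\nu = \delta_\mu^\nu - \hat p_\mu\hat p^\nu \qquad (25)$$

and

$$P^{(0)}{}_\mu{}^\nu = \hat p_\mu\hat p^\nu \qquad (26)$$

where

$$\hat p_\mu = p_\mu/(p^2)^{1/2} \qquad (27)$$

and we have

$$(P^{(1)})^2 = P^{(1)}, \qquad (28)$$

$$(P^{(0)})^2 = P^{(0)} \qquad (29)$$

and

$$P^{(1)}P^{(0)} = 0 . \qquad (30)$$

We can write

$$K_\mu{}^\nu = \bigl[(p^2 - m^2)\,P^{(1)} - m^2 P^{(0)}\bigr]_\mu{}^\nu . \qquad (31)$$

The inverse of $K$ follows immediately, giving

$$G_\mu{}^\nu = \Bigl[\frac{1}{p^2 - m^2}\,P^{(1)} - \frac{1}{m^2}\,P^{(0)}\Bigr]_\mu{}^\nu = \frac{1}{p^2 - m^2}\Bigl(\delta_\mu^\nu - \frac{p_\mu p^\nu}{m^2}\Bigr). \qquad (32)$$

Again this shows that we are discussing a massive theory of spin one, since only the spin one components have a pole in $p^2$ and therefore propagate in time, i.e. they are dynamical variables. We see also that the limit $m \to 0$ is not straightforward. In fact for $m = 0$ the operator $K_\mu{}^\nu$ does not have an inverse since it is proportional to a projection operator.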
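The algebra of the spin one projection operators, and the failure of the inversion at $m = 0$, can be confirmed with exact rational arithmetic. The following Python sketch is only a numerical check under stated assumptions: the momentum and mass values are hypothetical, the function names are our own, and the mixed tensors are stored as matrices with the upper index labelling rows, which leaves the products unchanged.

```python
from fractions import Fraction

ETA = (1, -1, -1, -1)  # diagonal metric, signature (+, -, -, -)


def projectors(p):
    """Return (P1, P0, p2) with P0^m_n = p^m p_n / p^2 and P1 = 1 - P0."""
    p2 = sum(ETA[m] * p[m] * p[m] for m in range(4))
    p0 = [[Fraction(p[m] * ETA[n] * p[n], p2) for n in range(4)]
          for m in range(4)]
    p1 = [[(m == n) - p0[m][n] for n in range(4)] for m in range(4)]
    return p1, p0, p2


def matmul(a, b):
    """Contract the lower index of a with the upper index of b."""
    return [[sum(a[m][r] * b[r][n] for r in range(4)) for n in range(4)]
            for m in range(4)]


def combine(c1, a, c0, b):
    """Linear combination c1 * a + c0 * b of two 4 x 4 arrays."""
    return [[c1 * a[m][n] + c0 * b[m][n] for n in range(4)] for m in range(4)]


def kinetic_operator(p, m2):
    """K = (p^2 - m^2) P1 - m^2 P0."""
    p1, p0, p2 = projectors(p)
    return combine(p2 - m2, p1, -m2, p0)


def propagator(p, m2):
    """G = P1 / (p^2 - m^2) - P0 / m^2, defined for m2 != 0 and p^2 != m2."""
    p1, p0, p2 = projectors(p)
    return combine(Fraction(1, p2 - m2), p1, Fraction(-1, m2), p0)


IDENTITY = [[int(m == n) for n in range(4)] for m in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

p = [3, 1, 1, 1]  # hypothetical momentum, p^2 = 6
m2 = 2            # hypothetical mass squared

P1, P0, _ = projectors(p)
print(matmul(P1, P1) == P1, matmul(P0, P0) == P0, matmul(P1, P0) == ZERO)
print(matmul(kinetic_operator(p, m2), propagator(p, m2)) == IDENTITY)

# For m = 0 the operator annihilates the vector p itself, so no inverse exists.
K0 = kinetic_operator(p, 0)
print([sum(K0[m][n] * p[n] for n in range(4)) for m in range(4)])
```

The null vector found at $m = 0$ is the momentum $p^\mu$, i.e. the direction of a pure gauge field.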
This is precisely the manifestation of the gauge invariance of the massless theory. To solve our equations (16) for $m = 0$, one normally 'goes into' some particular gauge, i.e. we modify the equations of motion so as to introduce an operator $K_\mu{}^\nu(\lambda)$ which does have an inverse. The simplest set of gauges are the covariant gauges, which depend on a parameter $\lambda$. We define

$$K_\mu{}^\nu(\lambda) = \Bigl(p^2 P^{(1)} + \frac{p^2}{\lambda}\,P^{(0)}\Bigr)_\mu{}^\nu \qquad (33)$$

which does have an inverse, which is

$$G_\mu{}^\nu(\lambda) = \frac{1}{p^2}\bigl(P^{(1)} + \lambda P^{(0)}\bigr)_\mu{}^\nu . \qquad (34)$$

Of course, gauge invariance implies that any physical result must be independent of the gauge, i.e. independent of $\lambda$. We saw before that as a consequence of the equations of motion for the massless theory the current $j_\mu$ must be conserved, which means

$$p^\mu j_\mu(p) = 0 . \qquad (35)$$

Substituting (34) into (19) and using (26) and (35) we can write for the case $m = 0$

$$A_\mu(p) = \frac{1}{p^2}\,P^{(1)}{}_\mu{}^\nu\, j_\nu(p) = \frac{j_\mu(p)}{p^2} . \qquad (36)$$

Any classical experiment which will involve the interaction of two sources, say $j^{(1)}$ and $j^{(2)}$, will depend on

$$j^{(1)\mu}A^{(2)}_\mu = \frac{j^{(1)\mu}\,j^{(2)}_\mu}{p^2} \qquad (37)$$

and is indeed independent of the gauge.

We return now to the massive case. If in addition to the equations of motion (16) we postulate that the source is conserved, i.e. we assume equation (35) as a field equation, we can write for the case $m \neq 0$

$$A_\mu(p) = \frac{1}{p^2 - m^2}\,P^{(1)}{}_\mu{}^\nu\, j_\nu(p) = \frac{j_\mu(p)}{p^2 - m^2} . \qquad (38)$$

Assuming $m$ is very small (specifically, the Fourier components are such that $m^2 \ll p^2$) we find

$$A_\mu(p) \simeq \frac{j_\mu(p)}{p^2} . \qquad (39)$$

This completes the proof that, as far as physical observations are concerned, the theory of a classical spin one field for small mass approaches the results for a massless spin one field.

Let us now turn to the spin two case. Since $h_{\mu\nu}$ is a symmetric tensor with ten components, we proceed by writing $K$ in terms of its spin two, spin one and two spin zero projection operators. In order that we may find the inverse of $K$ easily we must find those two spin zero projection operators which are orthogonal.
This is carried out in Appendix A and we can write Kt = [(p 2 -m 2 )Q (2) -m 2 Q m + X(p)Q? } + Y(p)Qf*C ( 40 > where Q (2) , Q m , X, Y, Q? ) and Q< 0) are defined in the Appendix. Neither X(p) nor Y(p) vanish so that we can invert K to find r"f- /-J— r> <2) -— O a) +— !— Q (0) +— !— O' 0) V (41) G -"-\p 2 -m 2Q m 2 ° + X(p) Uc+ Y{p) Uc )»„ Using the results of the Appendix we can also write G ^ = 2Tp^m r )\\ 8 ^~3 r '^ J \ 2m 2 ) 3m 3 m J (42) Since neither X(p) nor Y(p) vanish we see that only the spin two components have a pole in p 2 and thus only the five spin two fields are dynamical variables. Again we see that we cannot take the limit m -»0 in (41). As before, for m = 0, K is a combination of projection operators which do not span the space of symmetric second rank tensors. Therefore K does not have an inverse. In fact we find f or m = (43) K°t = (p 2 Q (2) -2p 2 Q (0) )Z where Q^ = ^-/W(V*-p a p'') (44) This is just a manifestation of the gauge invariance of the theory. We can solve for the field ft M „ by 'going into' a gauge. This means we modify the equations of motion so that the modified K will have an inverse. The simplest class of gauges are the covariant ones and we write K&Lo. Ax) = {p 2 O i2) -2p 2 Q i0) +J-Q W +l ) O X, (45) where Q° must be orthogonal to Q m and is oil p =p„p v p p The inverse of K is Feldman 373 (46) G;!(A ,A 1 )=( J 5 o <2) -Ao <0) +A 1 o (1) +A d ) c \p 2p /, Substituting for Q (2) and Q m from the Appendix we have G£(A„, A,) =^{(Sl-pJ"M -P,P P )-\^ V -MM'* -P"P P ) (47) + (a**/B)j + (A 1 O (1) + A d ^ (48) The result of any observation must be independent of the A,. If we have two sources T^ and T^l, their interaction is proportional to Tr(l)A*''I,(2)_ rp{V)lJLVf~,aPrj4Z) (49) We saw that for a massless theory the source must be conserved as a conse- quence of the field equations, i.e. 
we must have P a T afi =Q Now from the properties of Q (1) and Q° we have so that we may finally write for the interaction of two sources 2 l aP (50) (51) (52) Let us now return to the massive case. We find in the Appendix that for small m y=-2p 2 and v~ 3m 2p X + Y ~3m 4U So that we have for m small (5*0) \p m 3m I "P (53) (54) (55) (56) 374 Uncertainty Principle and Foundations of Quantum Mechanics If we choose sources such that equation (50) is an equation of motion, then using equation (51) we have for the interactions between two sources when the spin two field has a small but non-zero mass (57) T-(2) which using (51) again gives 1 2 i aB (58) This result can only be the same as (52) if the sources are traceless. This is not normally the case. The energy momentum tensor for electromagnetic radiation is traceless while it is not for matter. This would give rise to a discrepancy in the bending of light experiment if gravitation were a spin two, small mass theory. [See van Dam and Veltman (1970) and Boulware and Deser (1972).] By comparing equations (56) and (47) we see how we could modify the spin two massive theory to give the same results as the spin two massless theory. We need only add in a spin zero particle which in the limit of small m will contribute the term Ln(°w 2p 2U ^ to the propagator. This is most easily accomplished by choosing the a in equation (6a) not equal to one. We saw for a * 1, equation (6a) is an equation for six dynamical variables, one of which will be the extra spin zero particle. However, we see from the relative sign between the (2) and Q <0) term that the extra spin zero particle must be a ghost. 4. THE LAGRANGIAN AND CANONICAL VARIABLES In this section we construct those Lagrangians which lead to the equations of motion (5) and (6). We do this in order to find the variables canonical to the independent dynamical variables. 
Having done so, we are able to pass to the m -*• limit in order to see what happens to the apparently vanishing degrees of freedom. We shall discuss the spin one case first. Given the equations of motion (5) one can easily find a Lagrangian from which they are derived. We may write* 2(x) = \A a tfAp +WA a S p a A $ +j a A a (59) where V0-. 2! re fiap\ c ^ vfipk ~ (60) where A^meansd^A* *This is the usual Lagrangian with kinetic energy term -iF^JF"" where F„„ = d u A v - d^A^ Feldman 375 This Lagrangian is not unique. By using the antisymmetry property of the Levi-Civita tensor we can add total derivatives to the Lagrangian by adding any multiple of* where I A a A afi A l «,/3 1 A^^w^*" (61) (62) In fact there is no need that this extra piece be Lorentz invariant since it does not contribute to the equations of motion. More generally one can add to the Lagrangian £(x), J£ K (x) where #«(*)=2lK,,KflA.A-% or/3 (63) where K a , k' b are any set of parameters. Since we have already discovered that the field A is not a dynamical variable, we will choose that Lagrangian such that the variable canonical to A , namely II is identically zero. Now n„^ 8£ T where We have " Sd°A a £ T = £+£ lt Thus If we choose we find that and Sff-A" 21 ' tapA u v™B = -(d„A a -d a A M ) 8d°A°~° n =o n,- = -(A,.-d,A ) (64) (65) (66) (67) (68) (69) (70) (71) We have of course been using the summation convention for summing over repeated indices. However, since in what follows we shall be writing down non-covariant additions to the Lagrangian we now specifically indicate summations where needed. 376 Uncertainty Principle and Foundations of Quantum Mechanics Since A is a dependent variable we use the equations of motion for A to find Ilj entirely in terms of A ( and A,. 
The equation of motion (5) for a = gives where and finally A _ *Ai _ h m 2 -V 2 m 2 -V 2 -8^ n,-M'-^ (72) (73) (74) with The inverse of A is B,d, A, = ^-^r -i ^3, (75) (76) For m = 0, we see that A does not have an inverse and is in fact a projection operator. It is precisely the helicity one projection operator. This indicates that for m = there are only two canonical momenta which are the momenta canonical to the two helicity one dynamical variables. We have been assuming that j„. does not depend on the A^ in which case ; will commute with the A,. Accordingly, as far as the canonical commutation relations are concerned we can replace the II, by nr=-A f/ A y (77) We will drop the superscript c whenever there is no confusion. The canonical commutation relations are (78) or using (76) (A i (x),n / (y)) = J8;6 3 (x-y) (A'(x), A,-(y)) = -i(s;-^)s 3 (x-y> (79) Of course we can write (79) only in the case when m # 0. We can see what happens in the limit m ->• by writing (78) separately for the helicity one and helicity zero subspaces. In Appendix B we show that we can write Al =A\ n + A?» (80) where and A? = RfA>, Af = RfA j *H^) RW = dfd: Feldman 377 (81) (82) (83) where R (l) and R (0) are the helicity one and zero projection operators, respectively. Now, from (77) and for m small u < V^* V 2 / V 2 V 2 ' (84) (0) The commutation relations (78) can be written (Ar ) (x),A) 1) (y)) = -«< 1) 5 3 (x-y) and Let us define or [A,<°>(x),^Af(y)] = /i?f5 3 (x-y) 4,-, mdjA' di<f> = -mA\ (0) (85) (86) (87) (88) (89) Equation (86) becomes (4>(x),<i>(y)) = i8 3 (x-y) The commutation relations (85) are precisely those satisfied in the m = case by the independent helicity one fields. The canonical variables for the helicity zero are $(x) and <f>(x) given by (87). This is verified by (89). We now look at the equations of motion satisfied by AJ 1 ' and <$>. 
From equation (5) for a = i we have (90) (91) UAi-didjA' -d,A° + m 2 Ai = -ji where = 3^ 378 Uncertainty Principle and Foundations of Quantum Mechanics We substitute for A from (72) to obtain (92) We saw that for the case m = current must be conserved and A,-*!," (93) the helicity one projection operator, so that for m = we have DA^-yl" (94) and from (72) A ° v 2 V 2 For m ^ but small we have (95) A -7?(1) '"■ ffCOl (96) We substitute into (92) and project out the helicity one and zero parts and use (18) to obtain DAi 1) = -y1 1) (97) and m V (98) If in the massive case we assume current conservation as a further equation of motion we find as m -*■ D4> = (99) Accordingly, as m -*■ the helicity one modes satisfy exactly the same equations of motion and commutation relations as the helicity one modes in the massless case. The helicity zero mode is, however, decoupled. This is the sense in which the third degree of freedom disappears. We now turn to the case of spin two. From the equations of motion (6) we can deduce a Lagrangian, namely, 2(x) = fr%A^-\m 2 he(8Z8e-8&X+TZh% (100) where A£=S"^e ^ (101) Feldman 379 Of course as in the spin one case this Lagrangian is not unique and we can add terms of the form where A may be or 2 ,L "a^A hptjK a Kp K p Kfr a0,pcr A" *" = d»e^ x e°"' yX d''ri f> " d*~ix atByX. t.v per (102) (103) (104) As in the spin one case we choose that Lagrangian such that the variables Ilo canonical to the redundant fields h° p are identically zero. Consider 82__i where J£(x) is given by (100). This gives whereas, or 82 _ 8d°h° c ^0; a ~ 2E0ap\E Ojla- OO Ho — 2~(8Z8 s r — 8 s a 8")dnhs8a OtJj j . 
- r i r\ S M, a = 2(3 a « r - ^rrt J od ho (105) (106) (107) (108) (109) Accordingly, we must add to !£{x) some terms of the form (102) which will insure that ns=o (110) We may do this if we add to ££{x) Choosing = (.srh a ^h a v -d v h aix d^K) + (d a h a0 d h r r -d h a0 d a h^ <e T (x)=<£{x) + 2 K {x) (HI) (112) 380 Uncertainty Principle and Foundations of Quantum Mechanics we find, 110 Sd°h° a ° The other six canonical momenta will be given by U ' 8d°h) From (105) we find ^=a M-W*£+Wg-5(d,V W/t,o) Sd hj From (112) we obtain Finally Sd hj Tin = hi + VijWrh r °-tid- (dfro + djHio) (113) (114) (115) (116) (117) We eliminate the dependent variables, h i0 using the equations of motion. In equation (6) we set the indices a = i and /3 = and using (8) we obtain (-V 2 +m 2 )A^ r0 = T i0 +dM-dih r r (H8) where A, r is given by (75). Equation (6) with a = P = gives the constraint equation on the six % namely \uh" = '00 (119) Now multiply (118) by A} 1 and use (119) to obtain *« = -raf 9ti+ T 0i +^3,0^°)] (120) AM —V L /« -I We substitute /i , into (117) to obtain n iy = A 1 >A /s /i"+A^;+^^[(a,To / +a / To I )+^a 1 a/(^^ )] (121) In writing our commutation relations we will assume that T„. v is indepen- dent of ha and tig. Since the h u are restricted by the constraints (119), we must find the n - which are restricted to the same subspace. We make use of the helicity projection operators defined in Appendix B. A general six component tensor h it can be decomposed into its helicity components: two helicity two, two helicity one, one helicity zero and a second helicity zero. hii = (R m +R m +R i0) +R l0) ^h rs (122) Feldman 381 If we choose i? 
<0> to be the helicity zero projection operator, A,-,A ra (0)rs _ **■>}' DW> rs = A m „A" (123) we may write using (119) h ii = (R w +R w +R m ) r iJh rs + ■ oo (124) (125) (126) (127) m „A V -m Let us define the projection operator P" such that We may write the canonical commutation relations as [Mx),n re (y)] = /P£S 3 (x-y) where U..=p"} n h 1 rt ij ± i] Tl mn I n„=/>3i« J Equivalently, we may write the commutation relations in the various helicity subspaces as [^f ) ,m? ) ]=/i?^5 3 (x-y) (128) where a =0,1,2 and h^ = R^h rs , etc. (129) Using the results of Appendix B, we can find the commutation relations for small m. Using (B.ll) we may write (in coordinate space) where Note, hP = ^0ti n +art n ) h? i —a{r h .+^)h' a'Ai" = o (130) (131) (132) so that the h \ are the helicity one components of a spin one field. Also m; } =-h\ (i) (133) 382 Uncertainty Principle and Foundations of Quantum Mechanics Substituting (B.31) into (128) we have Now take (d/dx,)(d/dy s ) of (134) and we obtain Define then -Jim (1) V2m Ai=-=T-hi = — =7-3, {Vis+^fjl (Mx), A r (y)) = -i(r,ir + d -^f) 8 3 (*-y) (134) (135) (136) (137) (138) where R? is denned by (82). Using (B.34) and (B.25) we have for small m, where Similarly from (B.33) h fjj^> h ^ h (0) = d -£h rs n " ~2\V 2 ) V 2 " Substituting into (128) and projecting out with S <0) we have [cw,|(^) 2 ^(y)]-^« 3 ('-y) We now operate on (142) with (139) (140) (141) (142) a j__d d_ dXj dXj dy r dy s to obtain [vV, |(^)V^°] = /W(x-y) (143) Feldman 383 Define and (143) becomes (144) [<f>(x),4>(y)] = iS 3 (x-y) (145) Of course the commutation relation (128) for a = 2 is the same for both m = and m # i.e. (h?,Il i2) ) = iR%8 3 (x-y) (146) As m -* the dynamical variables associated with helicity one and zero are the fields A t and <l> defined by (136) and (144), respectively. Next we find their equations of motion. 
Operate on equation (6) with d a d e which gives (d a d p h e a -Dh) = — d"d l3 Tt m Next take the trace of (6) and use (147) to give *:-^'- 2 (TZ-jpfdeTi) (147) (148) (149) 3w"\ " m We use (147) and (148) in equation (6) when a = /, fi =/ to obtain Oka - dtdji y- djdji "i+m 2 ^, ~ Tii+ Jm i \ T ^~m rT ) A K+ m 21 ) We substitute for h oi using (120) giving (a + m 2 )(A ir A is - ( ^2 )2 ) h rs _( did,d r d s \ djd/ f _ d„d^ v \ -\ AirAis ~(m 2 -V 2 ) 2 ) T + 3m 2 V» m 2 ' J Will rpll. , "lt"V rp^lA 3V" m 2 ' / (150) Using the formula for A, r A /s given by (B.24) the left side of equation (150) can be written for small m as □ (* (2) m R w- S ^+S / iirs (151) 384 Uncertainty Principle and Foundations of Quantum Mechanics We may now take projections of (150) to obtain the equations for the various u.v.^, km* Th* S° nrniection will iust give the constraint equation (119) in we may nuw w^ piuj^uv.^ — v--~, - , 1im helicity fields. The 5° projection will just give the constraint equation (119) the m ■* limit. Since Rf rS d r d s =Rf rS Vrs=Rf rS dr=0 we have where Using nh?=tf T< 2 >— n(2)rs T 1 ij — -TV ij 1 rs (152) (153) (154) (155) (156) (157) (158) R\j )r %d s =R ( i )rs Vrs = and equations (133) and (136) we have for small m si 2 V 2 L rn x m li If we choose sources which are conserved i.e. ^7^ = then as m -»0, equation (156) becomes DA, = This means the dynamical variables corresponding to the helicity one fields become decoupled in the m -> limit if the source is conserved. 
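The helicity projections used above can be illustrated numerically. The following is a minimal sketch, not the author's construction: it assumes a Euclidean three-space convention (transverse projector $\theta_{ij} = \delta_{ij} - p_i p_j/p^2$) rather than the metric convention of Appendix B, and the function names are hypothetical. It builds the helicity two projector on symmetric tensors and checks that it is idempotent, transverse, traceless and of rank two.

```python
import itertools

IDX = range(3)


def transverse(p):
    """theta_ij = delta_ij - p_i p_j / |p|^2 (assumed Euclidean convention)."""
    p2 = sum(c * c for c in p)
    return [[(1.0 if i == j else 0.0) - p[i] * p[j] / p2 for j in IDX] for i in IDX]


def helicity_two(p):
    """Transverse-traceless projector R_{ij,rs} on symmetric 3x3 tensors."""
    t = transverse(p)
    return {
        (i, j, r, s): 0.5 * (t[i][r] * t[j][s] + t[i][s] * t[j][r])
        - 0.5 * t[i][j] * t[r][s]
        for i, j, r, s in itertools.product(IDX, repeat=4)
    }


def compose(a, b):
    """(a b)_{ij,rs} = sum over k, l of a_{ij,kl} b_{kl,rs}."""
    return {
        (i, j, r, s): sum(a[i, j, k, l] * b[k, l, r, s] for k in IDX for l in IDX)
        for i, j, r, s in itertools.product(IDX, repeat=4)
    }


def trace(a):
    """Number of independent components projected out: sum of a_{ij,ij}."""
    return sum(a[i, j, i, j] for i in IDX for j in IDX)


def max_diff(a, b):
    return max(abs(a[key] - b[key]) for key in a)


def max_longitudinal(a, p):
    """Largest |p_i a_{ij,rs}|; vanishes for a transverse projector."""
    return max(
        abs(sum(p[i] * a[i, j, r, s] for i in IDX))
        for j, r, s in itertools.product(IDX, repeat=3)
    )


def max_trace_part(a):
    """Largest |a_{ii,rs}|; vanishes for a traceless projector."""
    return max(abs(sum(a[i, i, r, s] for i in IDX)) for r in IDX for s in IDX)
```

For any non-zero momentum `p`, `trace(helicity_two(p))` counts the two helicity two components, in line with the counting stated in Appendix B.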
We now project out the helicity zero field from equation (150), by multiplying the equation by $S^{(0)}$, taking the trace and using the definition (144), to obtain for small $m$

$$\cdots \qquad (159)$$

Again, assuming (157) and letting $m \to 0$ we find

$$\Box\phi = -\cdots \qquad (160)$$

Here, we see that in the limit, the helicity zero field does not decouple if

$$T^{\mu}{}_{\mu} \neq 0 \qquad (161)$$

Although the equation (153) together with the constraint equations (120) and (119) (with source conserved and $m = 0$) make up the content of the massless theory, the fact that the helicity zero field $\phi$ does not decouple shows that the $m \to 0$ theory is not the same as the massless theory.

5. CONCLUSIONS

The main conclusion of this work concerns the difference in behaviour between massive spin one and spin two theories as the masses approach zero. The limit as the mass approaches zero of a theory of a spin one field coupled to a conserved source gives the same observational results (in the classical limit or tree approximation limit) as the theory of massless spin one. However the same limit of a spin two theory coupled to a conserved source can give the same results only if the source is traceless in the limit. This is usually not the case if we expect the spin two source to be the energy-momentum tensor.

These results have been obtained before. By making use of the properties of the Levi-Civita tensor and constructing projection operators in spin and helicity space we have followed through in detail the properties of the various dynamical variables as the mass $m$ becomes small.

From the equations of motion we have seen how the number of dynamical variables changes from $2s + 1$ to 2 (where $s = 1$ or 2) depending on whether $m \neq 0$ or $m = 0$. This is due to the gauge invariance which the $m = 0$ theories possess.

The presence or absence of gauge invariance appears again in our construction of the propagators. For $m \neq 0$ the equations of motion can be inverted to give a unique propagator.
For $m = 0$, due to the gauge invariance, the equations of motion are proportional to spin projection operators and can only be inverted by choosing some gauge. We see that for spin one, a conserved source allows the $m \to 0$ limit to be taken and the $m = 0$ results are reproduced for physical observables. For the spin two case a conserved source alone does not reproduce the same results as the $m = 0$ theory. One would have to assume the source to be traceless in addition.

An examination of the dynamical variables and their canonical conjugates allows us to see what happens to the supposed disappearing degrees of freedom as $m \to 0$. For spin one we find that the extra helicity zero degree of freedom disappears only in the sense that it is decoupled if the source is conserved. For spin two the extra helicity one degrees of freedom are decoupled if the source is conserved, but the helicity zero component does not disappear and is coupled to the trace of the source. This again shows why the $m \to 0$ limit for spin two does not reproduce the same observable results as the $m = 0$ case.

ACKNOWLEDGMENTS

The author would like to thank Dr. T. Fulton for many discussions which aroused his interest in the problem. He would also like to thank Dr. Abdus Salam for his hospitality at the Institute for Theoretical Physics, Trieste, where much of this work was done. This work was supported partially by the N.S.F.

APPENDIX A

In this Appendix we construct the spin projection operators in the space of second rank symmetric tensors in Lorentz space.

A symmetric tensor in space-time has ten components and its spin decomposition is into spin two, spin one and two spin zero. Its spin two components $h^{(2)}_{\mu\nu}$, of which there are five, must satisfy

$$p^{\mu} h^{(2)}_{\mu\nu} = h^{(2)\mu}{}_{\mu} = 0 \qquad (A.1)$$

Thus the projection operator $Q^{(2)\alpha\beta}_{\mu\nu}$ must also have the properties

$$p^{\mu} Q^{(2)\alpha\beta}_{\mu\nu} = Q^{(2)\mu\,\alpha\beta}_{\;\;\;\mu} = 0 \qquad (A.2)$$

and it must be symmetric in $(\mu, \nu)$ and $(\alpha, \beta)$.
One easily finds Q%* = K(s:-pJ a )(st-pJ p ) + 0* «* ")] -toi„-0j.)(v--W) (A - 3) (Q (2) ) 2 = Q <2) ( A - 4 ) i.e. n (2)pcr n (2)aP _ n (2)a0 The spin one component h™ must have the properties that and p»h { ? = Q The projection operator Q™" which must be orthogonal to Q l2 l aP is found to be Q% a ' = 1 ipJ a (S' v -p^)+pJ (Sl-p^") + ^^^ = $pj a 8 e .+pj l3 8:+ o* «* »)] -iPM a f ( A - 8 ) (A.5) (A.6) (A.7) (A.9) and again and Q (2) Q (1) = (A. 10) Since the symmetric tensor h^ is a reducible representation of the Lorentz group, there is no unique decomposition of the two spin zero parts. In fact there is generally a one parameter infinity of two orthogonal projection operators. (Q (1) ) 2 = (1) Let us define two projection operators, and Now BJl-PvPvP p A Z = A B 2 = B Feldman 387 (A. 11) (A. 12) (A. 13) (A. 14) (A. 15) (A. 16) (A.17) (A. 18) AQ (2) = AQ m = BQ m = BQ m = However AB*0 Let us define C=AB+BA Tnatis Ct = kv^"P +Pj,V ae ) and we have the following identities ABA = A/4 (A. 19) BAB=B/A (A.20) AC+ CA = A/2 + C (A.21) BC+CB=B/2 + C (A.22) C 2 = (A+B + Q/4 (A.23) We find that the most general spin zero projection operator can be written Q™ = aA+bB-cC (A.24) where a and b are the two roots of 4x 2 -2(c + 2)x+c 2 = (A.25) for any c. The roots will be real provided 2^c^-| (A.26) For the case c = 0, either a or b = 0, i.e. A and B are projection operators. Another special case is c=| (A.27) 388 Uncertainty Principle and Foundations of Quantum Mechanics in which case <^%-Q (0) 4(A + f-c) i.e. -|(0)a/3 _ 1 (A.28) (A.29) For a given Of one can find a second projection operator Of orthogonal to Of O< 0) =.aA+6£-cC where we find a+d -b+b-c+c =f Using (A. 11) and (A. 12) one can demonstrate that (O (2) + O (1) + O< 0) + 6f)^ = §(S^+<5^) (A.30) (A.31) (A.32) We now write the operator K% defined by equation (19) in terms of these projection operators. 
First, we have K£ = [(p 2 -m 2 )Q™-m 2 Q m -i(p 2 -m 2 )A-l(p 2 + 2m 2 )B + U2p 2 + m 2 )C]% (A-33) We must now find X(p), y(p)andc(p)suchthattheA,B, C part of (A33)can be written XQ^+YQ™ To do this we must solve the equations aX+aY=-*(p 2 -m 2 ) bX+bY=-l(p 2 + 2m 2 ) cX+cY=-1(2p 2 + m 2 ) (A.34) (A.35) (A.36) (A.37) where we also use (A.31) and (A.25). We find that X and Y are the two solutions for u of u 2 + 2u(p 2 -m 2 )-3m 4 = Thus XY=-3m* (A.38) (A.39) Feldman 389 This implies neither X nor Y vanishes for any value of p 2 . We take Y as For small m, and In addition y= -(p 2 -m 2 )+[(p 2 -m 2 ) 2 + 3m 4 ] i 2p 2 X= -(p 2 -m 2 )-[(p 2 -m 2 ) 2 + 3m 4 f = — 2p 2 for small m 4p 2 + 2m 2 2 a = — 3 X-Y 3 4 (p 2 -m 2 ) 2 3 X-Y 3 and Indeed as m -*■ 0, 2p 2 -4m 2 2 b = + 3-JTY- + 3 c,a-*t and b^\ when ;O (0) is given by (A.29) and 6f-+6 (0) = --B Thus as m -►0 XQ «» + YQ«»->- -2p 2 Q m which produce equation (43). For the inverse of K we have G afi =( - 1 )<2)__L d) , \p~ — m~ m X We look for the small m limit of o^ + o^ (A.40) (A.41) (A.42) (A.43) (A.44) (A.45) (A.46) (A.47) (A.48) (A.49) (A.50) (A.51) (A.52) 390 Uncertainty Principle and Foundations of Quantum Mechanics We have Q< 0) <% 0) JaY+aX)A+(bY+bX)B-(cY+cX)C X Y XY Now aY+dX=+i(p 2 -m 2 )+i(X+Y) = (A.54) 6r+Mr=-!(p 2 -4m 2 )+|(AT+y) = -2(p 2 -2m 2 ) (A.55) cY+cX = i(p 2 + 2m 2 ) + l(X+Y) = 4m 2 Using (A. 3 9) we have ^-+^- = -^[(p 2 -2m 2 )S + 2m 2 C] (A.56) Thus for small m ( \<^j£ls = |E> (A.57) X Y 3m 4 3m This gives equation (55) for small m. APPENDIX B In this Appendix we construct the helicity projection operators in the space of vectors and second rank symmetric tensors in Euclidean three-space. 
A vector A, in three-space has three components: two helicity one compo- nents, A ( l\ and one helicity zero component A ( °\ The helicity one components are combinations of the transverse components of A, and so must satisfy (in momentum space) p'A^O The projection operator R\P into this space is R l ^ = (vu+PiPi) (B.l) (B.2) where and PtP =-1 (B.3) R£ ) R Wkl = R\ 1) ' (B.4) The helicity zero projection operator R^ must be orthogonal to R m and is Rf = -m (B.5) A symmetric tensor in three-space has six components and its helicity decomposition is into helicity two components (2), helicity one components (2) Feldman 391 and two independent helicity zero components. The helicity two components h i ; 2) must satisfy the relations p%f = h i2)i = (B.6) The projection operator R\f )rs into this space is (in momentum space) r? rs = Wi+mw+Pp) + (»■ **m-kv„ +m)(v rs +pT) (b.7> and, of course, (R (2) ) 2 = R (2) (B.8) The helicity one components h™ must have the property hQ^dJiV + drf" (B.9) and p'hf^O (B.10) We find R\l )rs = -kpp'iS'i+pW+PiP'W+pWHi^j)] (B.H) (R (1) ) 2 = R a) (B.12) and R (1) R m = (B.13) Just as in Lorentz space (in Appendix A) there will be an infinity of the two helicity zero projection operators which can be made up of linear combinations of the operators hiv rs ,mp r p s , -Hw'p'+pm") . (B.14) However from equation (1 19) we know which helicity zero component of /r„ is not a dynamical variable, namely the one which is proportional to Ajjh" (B.15) The helicity zero projection operator which projects this field out of /i„ is Rf rs = A ii A rs /A mn A m " (B.16) Thus the helicity zero component which will be a dynamical variable will be obtained by finding that helicity zero projection operator R (0) which is orthogonal to R (0) . 
This operator is easily found to be n(0)rs_ m I R " ~2Z\ 2p 2 + 3m 2 m v«+ — —* — PiPijyn + — ^2 — ppJ (B.17) where Z = (2/ + 4p 2 m 2 + 3m 4 ) (B.18) 392 Uncertainty Principle and Foundations ol Quantum Mechanics Accordingly, the dynamical helicity variables are hf=Rf rs h rs with a = 0,1, 2 and the variables canonical to these will be the with IT„ given by (121) We have since We may write R\r° A rs =0 R (o) R° = Ylt ) = R ( i ? rs Ar k K l ti k ' (B.19) (B.20) (B.21) (B.22) (B.23) where we have neglected the terms depending on the source r* since they will not contribute to the canonical commutation relations. We have A*AjiM*<»+^W^)> + $ m ] f (B - 24) L p +m \p +m / Jrski where and M0)kl_ * * *k*l Srs =PrP s P P ST'= l 2(Vr S +P&)(V k '+p k p') Using (119) and neglecting source terms we may write m (Vki +PkPi)h kl = 2 + m 2 WV»' Also from (B.21) we may write »(0)ra, Rf"(Vrs+PrPs)=-^2Rf rS PrP ! p +m (B.25) (B.26) (B.27) (B.28) This gives Finally we have (R (0) S°) k !h kl = \{-^){R°S^!h kl (B.29) tf? = Rf"h rs m i >= m m p +m -K a n rs ~ 7 a ii n, ij 2 , 2 JV '/ "" „ 2 '' rs (B.30) (B.31) Feldman 393 for small m, and Substituting (B.17) for R (0) and keeping only leading terms for small m we obtain U (0) 3(m_Y v (ow " 2\p 2 ) " " Similarly for small m we may write (B.33) (B.34) REFERENCES Boulware, D. G. and Deser, S. (1972) Phys. Rev., D6, 3368. Deser, S. (1970) /. Gen. Rel. Grav., 1, 9. Fierz, M. and Pauli, W. (1939) Proc. Roy. Soc., 173, 211. Van Dam, H. and Veltman, M. (1970) Nucl. Phys., B22, 397. Relativistic Electromagnetic Interaction Without Quantum Electrodynamics JOHN H. DETRICH University of Wisconsin, Madison, Wisconsin, U.S.A. and CLEMENS C. J. ROOTHAAN University of Chicago, Chicago, Illinois, U.S.A., and Ohio State University, Columbus, Ohio, U.S.A. 1. INTRODUCTION In the extension of the Dirac theory of the electron to many-electron systems, the most obvious difficulty is a satisfactory treatment of relativistic elec- tromagnetic interactions. 
The first such treatment was worked out many years ago by Breit (1929, 1930, 1932). A more elaborate derivation of Breit's results, based on quantum electrodynamics, was given by Bethe and Salpeter (1957). While quantum electrodynamics is thus capable of describing relativistic electromagnetic interactions, there are in principle and in practice still serious difficulties in this approach.

One reason for this situation is that relativistic effects are intrinsically far from simple. However, quantum electrodynamics complicates matters by yielding results which are not readily absorbed in the framework within which one naturally deals with many-electron problems. Thus quantum electrodynamics is organized around a perturbation treatment that regards quantum electrodynamical effects as weak. But in many-electron systems the quantum electrodynamical effects include the Coulomb interactions between electrons, and these cannot comfortably be regarded as weak. Again, quantum electrodynamics tends to sidestep the question of the influence of quantum electrodynamical effects on the wave function. Indeed, it may be inconsistent to take this up, since one of these effects is the interaction of an electron with its own field, and even a finite retarded self-action compromises the definition of a wave function (Feynman, 1948). Treatments of many-electron systems, on the other hand, customarily are formulated in terms of wave functions.

These considerations suggest that a vantage point somewhat different from that offered by quantum electrodynamics may be useful in treating relativistic electromagnetic interactions. We present in this paper a formalism that provides such an alternative approach.
Since one of our principal objectives is to deal with relativistic electromagnetic interactions in terms of a vocabulary that differs from the one customary in quantum electrodynamics, we begin in Sections 1 and 2 with a brief outline of this vocabulary. Our extensive use of matrix elements is in the spirit of the original formulation of quantum mechanics by Heisenberg (1925), Born and Jordan (1925), Born, Heisenberg and Jordan (1926) and Dirac (1926). Section 3 reviews Dirac's treatment (1928) of a single relativistic electron.

In Section 4 we take up the actual treatment of relativistic electromagnetic interactions. Our considerations apply to an isolated system containing an arbitrary number of Dirac particles. Of course our assumption that the system is not subject to external disturbances is only an approximation, since real physical systems can always interact with external objects through emission or absorption of radiation or other means. Here we accept the limitation this approximation imposes on our treatment, since our interest is in interactions between particles. This is necessary in order to deal with a total energy for the system which is conserved and therefore capable of precise definition.

We deal here in terms of many-particle wave functions, and this implies that the number of particles remains constant. Thus no provision is made for the possible creation and/or annihilation of electron-positron pairs.

No additional approximation need be adopted in treating the relativistic electromagnetic interactions. We find that these interactions can be handled in closed form, using the version of classical electrodynamics advocated by Wheeler and Feynman (1945, 1949). Just as we desire, this formulation deals directly in terms of interparticle interactions, without reference to a mediating electromagnetic field. In such a theory there is nothing that compels one to include the action of a particle on itself, as Wheeler and Feynman demonstrate.
Following them, we omit such self-actions, thereby avoiding all of the difficulties associated with those terms in conventional quantum electrodynamics.

Quantum mechanical treatments based on the electrodynamics of Wheeler and Feynman have previously been presented by Hoyle and Narlikar (1974), and also by Davies (1971, 1972). However there is little in this work that bears directly on our approach: while these treatments differ in some respects from conventional quantum electrodynamics, they adopt similar vantage points and vocabularies. Thus, from our point of view, they have essentially the same drawbacks as conventional quantum electrodynamics, including, rather unexpectedly, the occurrence of self-interactions.

In the remaining sections we take up the Pauli approximation. The results are significant in their own right, even though we simply reproduce long-accepted theoretical expressions. Careful examination of previous derivations reveals that they are somewhat inadequate. One problem is that these derivations start from an approximate treatment of the interactions. This is hardly to be avoided in a treatment based on quantum electrodynamics since this theory is organized around a perturbation treatment, but it places one in the awkward position of extracting one approximation from another. A more severe problem comes from the fact that quantum electrodynamics avoids evaluating electrodynamical effects on the wave function. To make up for this, one is obliged to adopt some more or less arbitrary assumptions for such effects. For example, Breit (1929, 1930, 1932) found that part of his approximate interaction term should not be included in the Hamiltonian used to determine the wave function, even though its expectation value contributes to the energy. A derivation of the Pauli approximation which is free of such difficulties is presented in Section 6.
In Section 7 we assess the significance of our results, and present a few very general suggestions for further work along these lines.

2. QUANTIZATION OF A NON-RELATIVISTIC HAMILTONIAN SYSTEM OF PARTICLES

For the classical description of a system of $N$ particles, let the cartesian coordinates and conjugate momenta be denoted by $\mathbf{r}_\mu$ and $\mathbf{p}_\mu$, respectively, $\mu = 1, 2, \ldots, N$; we shall denote them collectively by

$$\mathbf{r}^N = (\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N), \qquad \mathbf{p}^N = (\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N) \qquad (1)$$

There exists a Hamiltonian function $H(\mathbf{r}^N, \mathbf{p}^N)$; the time evolution of the system is governed by the canonical equations of motion

$$\dot{\mathbf{r}}_\mu = \frac{\partial H}{\partial \mathbf{p}_\mu}, \qquad \dot{\mathbf{p}}_\mu = -\frac{\partial H}{\partial \mathbf{r}_\mu} \qquad (2)$$

From these equations one can in principle calculate $\mathbf{r}_\mu(t)$ and $\mathbf{p}_\mu(t)$ for any time $t$, given a set of initial values $\mathbf{r}_\mu(0)$, $\mathbf{p}_\mu(0)$.

To describe the $N$-particle system quantum mechanically, we replace the coordinates and momenta by appropriate operators. It is customary to use for these operators the same symbols as for the classical quantities they replace. The coordinates and momenta must satisfy the well-known commutation relations, which can be written in the two equivalent forms

$$[\mathbf{p}_\mu, \mathbf{a}\cdot\mathbf{r}_\nu] = -i\hbar\,\delta_{\mu\nu}\,\mathbf{a}, \qquad [\mathbf{a}\cdot\mathbf{p}_\mu, \mathbf{r}_\nu] = -i\hbar\,\delta_{\mu\nu}\,\mathbf{a} \qquad (3)$$

where $[A, B] = AB - BA$ is the commutator of the operators $A$ and $B$, and $\mathbf{a}$ is any constant vector.

In general, algebraic equations between classical quantities are reinterpreted as operator equations. This process is usually not unique, since any product involving conjugate coordinates and momenta becomes dependent on the order of the factors, because of the commutation relations (3). Often the requirement that any observable must be represented by a Hermitian operator resolves the ambiguity. With that understanding the Hamiltonian operator is now defined in terms of the operators $\mathbf{r}_\mu$, $\mathbf{p}_\mu$ by the functional expression $H(\mathbf{r}^N, \mathbf{p}^N)$.

The replacement of the classical variables by operators leaves undefined the operators corresponding to time derivatives of those variables.
The time derivative of an operator $A$ which does not explicitly depend on the time is now defined by

$$\dot{A} = i\hbar^{-1}[H, A] \qquad (4)$$

If equation (4) is used to evaluate $\dot{\mathbf{r}}_\mu$ and $\dot{\mathbf{p}}_\mu$, one can prove, using the commutation relations (3), that the canonical equations (2) are now valid as operator equations.

In the Schrödinger representation, which we shall use throughout this paper, the coordinates are taken over without change as multiplicative operators. The momenta must then be defined by

$$\mathbf{p}_\mu = -i\hbar\nabla_\mu = -i\hbar\frac{\partial}{\partial \mathbf{r}_\mu} \qquad (5)$$

in order that the commutation relations (3) are satisfied.

While classically the time evolution of the system is specified through the explicit functions of the time $\mathbf{r}_\mu(t)$ and $\mathbf{p}_\mu(t)$, quantum mechanically the time evolution is expressed in the time-dependent wave function $\Psi(\mathbf{r}^N, t)$, which must satisfy the time-dependent Schrödinger equation

$$i\hbar\frac{\partial \Psi}{\partial t} = H\Psi \qquad (6)$$

This equation can be satisfied by wave functions representing stationary states, namely

$$\Psi(\mathbf{r}^N, t) = \Psi(\mathbf{r}^N)\,e^{-i\omega t}, \qquad \omega = E/\hbar \qquad (7)$$

where $E$ is the energy of the stationary state represented by $\Psi(\mathbf{r}^N, t)$. The time-independent wave function $\Psi(\mathbf{r}^N)$ satisfies the time-independent Schrödinger equation

$$H\Psi = E\Psi \qquad (8)$$

which states that $\Psi$ and $E$ must be eigenfunction and eigenvalue, respectively, of the operator $H$.

The determination of the entire set of eigenvalues and corresponding eigenfunctions of a given Hamiltonian constitutes the central problem of non-relativistic quantum mechanics. Conceptually, the simplest systems are those for which the entire eigenvalue spectrum is discrete, each eigenvalue having only a finite number of linearly independent eigenfunctions; the harmonic oscillator is the best known example. For the all-discrete case it is natural to denote the stationary state wave functions by

$$\Psi_{m\alpha}(\mathbf{r}^N, t) = \Psi_{m\alpha}(\mathbf{r}^N)\,e^{-i\omega_m t}, \qquad \omega_m = E_m/\hbar \qquad (7a)$$

where $m$ is an appropriate discrete index to label the distinct energies, while $\alpha$ labels degenerate wave functions if necessary.
For the time-independent wave functions we have of course

$$H\Psi_{m\alpha} = E_m\Psi_{m\alpha} \qquad (8a)$$

It is well known that degeneracy is a necessary consequence of the symmetry of the problem. A symmetry operator is by definition a unitary operator which commutes with the Hamiltonian. The symmetry operators form a group; the wave functions $\Psi_{m\alpha}$ which belong to the energy $E_m$ transform among themselves under symmetry operations, and the transformation matrices constitute a representation of the group. If we allow for the possibility $E_m = E_n$, $m \neq n$, then it is no loss of generality to assume that the representation associated with a level $E_m$ is always irreducible; the particular representation to which the set $\Psi_{m\alpha}$ belongs is called its symmetry species. The normal situation is that the $E_m$ are distinct; whenever $E_m = E_n$, $m \neq n$ occurs, it is called accidental degeneracy. Consideration of the symmetry properties of the wave function leads to the so-called good quantum numbers; a corresponding compound labelling of the energy and wave functions is usually adopted. For our present purposes, this is not necessary; we label the energies with a single index $m$, and recognize non-accidental degeneracy of each level by the wave function label $m\alpha$.

It is customary to postulate that the time-independent functions $\Psi_{m\alpha}$ constitute a complete orthonormal basis in the Hilbert space of functions used to describe the $N$-particle system at any given time $t$. The orthonormality is expressed by

$$\langle \Psi_{m\alpha} | \Psi_{n\beta} \rangle = \int d\mathbf{r}^N\, \Psi^*_{m\alpha}(\mathbf{r}^N)\,\Psi_{n\beta}(\mathbf{r}^N) = \delta_{m\alpha,n\beta} = \delta_{mn}\delta_{\alpha\beta} \qquad (9)$$

where $\int d\mathbf{r}^N$ denotes $3N$-dimensional integration over all particle coordinates. Completeness of the base is conveniently expressed in Dirac notation by

$$\mathcal{I} = \sum_{m\alpha} |\Psi_{m\alpha}\rangle\langle\Psi_{m\alpha}| \qquad (10)$$

where $\mathcal{I}$ is the identity operator; in ordinary functional notation completeness is expressed by

$$\Psi = \sum_{m\alpha} \Psi_{m\alpha}\,\langle\Psi_{m\alpha}|\Psi\rangle = \sum_{m\alpha} \Psi_{m\alpha}(\mathbf{r}^N)\int d\mathbf{r}'^N\, \Psi^*_{m\alpha}(\mathbf{r}'^N)\,\Psi(\mathbf{r}'^N) \qquad (10a)$$

where $\Psi$ is an arbitrary function within certain reasonable constraints. This completeness postulate is fundamental to quantum mechanics; hence only
This completeness postulate is fundamental to quantum mechanics; hence only 400 Uncertainty Principle and Foundations of Quantum Mechanics Hamiltonians which yield as their eigenfunctions complete sets can be used to describe physical systems. Most systems of physical interest do not fall in the all-discrete category. The other extreme occurs for a free particle, where the entire spectrum consists of all E^O, so that it is all -continuous. Most common is the mixed discrete- continuous case; for instance, the hydrogen atom has continuous eigenvalues E 2*0, and discrete eigenvalues E m = -R/m 2 , m = 1, 2, . . . , where R is the Rydberg constant. Clearly, equations (7a), (9), (10) and (10a) must now be reinterpreted. In general, the index m has a discrete and a continuous range, while a can be maintained as a discrete index. With this understanding, equation (7a) remains valid as it stands. In equation (9), if either morn is in the discrete range, no change is required; but if both m and n are in the continuum, 8 must beVeplaced by a Dirac delta-function. In equations (10) and (10a) we nTust deal wu* the sum I ma = I m I a - It is clear that I„ is to be retained as a discrete sumAhowever £ m must be understood as a discrete sum and/or integration, depending on whether E m is in the discrete or continuous range, respectively. In our further deliberations we shall restrict ourselves to the all-discrete notation. This, however, should not limit the validity of our final results, since in the latter all references to stationary states will have disappeared. 
We use the stationary states to define the time-dependent and time-independent matrix elements of an operator, namely

$$A_{m\alpha,n\beta}(t) = \langle \Psi_{m\alpha}(t) | A | \Psi_{n\beta}(t) \rangle = \int d\mathbf{r}^N\, \Psi^*_{m\alpha}(\mathbf{r}^N, t)\, A(\mathbf{r}^N, \mathbf{p}^N)\, \Psi_{n\beta}(\mathbf{r}^N, t)$$
$$A_{m\alpha,n\beta} = \langle \Psi_{m\alpha} | A | \Psi_{n\beta} \rangle = \int d\mathbf{r}^N\, \Psi^*_{m\alpha}(\mathbf{r}^N)\, A(\mathbf{r}^N, \mathbf{p}^N)\, \Psi_{n\beta}(\mathbf{r}^N) \qquad (11)$$

Clearly, $A_{m\alpha,n\beta}(t)$ and $A_{m\alpha,n\beta}$ are related by

$$A_{m\alpha,n\beta}(t) = A_{m\alpha,n\beta}\, e^{i\omega_{mn}t}, \qquad \omega_{mn} = \omega_m - \omega_n \qquad (12)$$

For $m = n$ the time dependence drops out, and we write

$$A_{m\alpha,m\beta}(t) = A_{m\alpha,m\beta} = A_{m\alpha\beta} \qquad (13)$$

We note that the matrix elements of the Hamiltonian are given by

$$H_{m\alpha,n\beta} = E_m\,\delta_{m\alpha,n\beta} \qquad (14)$$

The matrix elements of the product of two operators are defined by

$$(AB)_{m\alpha,n\beta}(t) = \sum_{p\gamma} A_{m\alpha,p\gamma}(t)\, B_{p\gamma,n\beta}(t), \qquad (AB)_{m\alpha,n\beta} = \sum_{p\gamma} A_{m\alpha,p\gamma}\, B_{p\gamma,n\beta} \qquad (15)$$

this is easily proved using equations (7a), (10) and (11).

For the time derivatives of the matrix elements we invoke the definition in terms of the commutator with the Hamiltonian, equation (4); we readily find

$$\dot{A}_{m\alpha,n\beta}(t) = i\omega_{mn}\, A_{m\alpha,n\beta}(t), \qquad \dot{A}_{m\alpha,n\beta} = i\omega_{mn}\, A_{m\alpha,n\beta} \qquad (16)$$

We note that the first equation (16) would also result if we defined $\dot{A}_{m\alpha,n\beta}(t)$ as $dA_{m\alpha,n\beta}(t)/dt$; this is of course the justification, in retrospect, for the definition of the time derivative of an operator, equation (4). The second equation (16) extends this definition by providing a time derivative for the matrix elements $A_{m\alpha,n\beta}$, even though these are time-independent quantities.

Quantization of classical relations can now be stated in terms of matrix elements, preferably the time-independent ones. One simply replaces classical quantities by the corresponding matrix elements, honouring the rules for products and time derivatives, equations (15) and (16); of course ambiguities due to the order of factors have to be resolved in the same manner as for the operators. Clearly, quantization in terms of matrix elements is completely equivalent to quantization in terms of operators, and vice versa.

From the matrix elements and wave functions one can recover the operators.
In general

$$A = \sum_{m\alpha}\sum_{n\beta} |\psi_{m\alpha}\rangle\, A_{m\alpha,n\beta}\, \langle\psi_{n\beta}| \tag{17}$$

For the Hamiltonian we have the special formula

$$H = \sum_{m\alpha} |\psi_{m\alpha}\rangle\, E_m\, \langle\psi_{m\alpha}| \tag{18}$$

In addition to the operators for the positions and momenta of the particles, we shall need operators representing the charge and current densities associated with the particles. We define the charge density associated with the $\mu$th particle by means of

$$\rho_\mu(\mathbf{r}') = e_\mu\, \delta(\mathbf{r}' - \mathbf{r}_\mu) \tag{19}$$

Here $e_\mu$ is the charge of the $\mu$th particle, and $\delta(\mathbf{r}' - \mathbf{r}_\mu)$ is the three-dimensional Dirac delta-function, so that the relation

$$f(\mathbf{r}) = \int d\mathbf{r}'\, \delta(\mathbf{r} - \mathbf{r}')\, f(\mathbf{r}') \tag{20}$$

holds for any function $f(\mathbf{r})$ which is reasonably well-behaved. In equation (19), the space indicated by $\mathbf{r}'$ designates the position of an external electric probe, and thus may be regarded as the observer's space. In the following, we shall use $\mathbf{r}'$, $\mathbf{r}''$ for electromagnetic probe positions, and $\mathbf{r}_\mu$, $\mathbf{r}_\nu$ for particle positions, or simply $\mathbf{r}$ for the position of a single particle.

In conjunction with the charge density we define the current density operator by means of

$$\mathbf{j}_\mu(\mathbf{r}') = (2c)^{-1}\{\dot{\mathbf{r}}_\mu, \rho_\mu(\mathbf{r}')\} \tag{21}$$

where in general $\{A, B\} = AB + BA$ is the anticommutator of the operators $A$ and $B$, and the velocity $\dot{\mathbf{r}}_\mu$ is defined by equation (4), specifically

$$\dot{\mathbf{r}}_\mu = i\hbar^{-1}[H, \mathbf{r}_\mu] \tag{22}$$

Since $\dot{\mathbf{r}}_\mu$ and $\rho_\mu(\mathbf{r}')$ are both Hermitian, the symmetric product on the right-hand side of equation (21) guarantees that $\mathbf{j}_\mu(\mathbf{r}')$ is also Hermitian. Note that we have used the electromagnetic rather than the electrostatic definition of the current density; $\mathbf{j}_\mu(\mathbf{r}')$ has the same dimension as $\rho_\mu(\mathbf{r}')$.

Up to this point we have not made any use of the specific form of the Hamiltonian. For our system of point particles the non-relativistic Hamiltonian is

$$H(\mathbf{r}^N, \mathbf{p}^N) = T(\mathbf{p}^N) + V(\mathbf{r}^N) \tag{23}$$

where

$$T(\mathbf{p}^N) = \sum_\mu (2m_\mu)^{-1}\, \mathbf{p}_\mu\cdot\mathbf{p}_\mu \tag{24}$$

is the kinetic energy, and $V(\mathbf{r}^N)$ is the potential energy.
We find easily for the velocity of the $\mu$th particle

$$\dot{\mathbf{r}}_\mu = m_\mu^{-1}\, \mathbf{p}_\mu \tag{25}$$

and therefore for the current density

$$\mathbf{j}_\mu(\mathbf{r}') = (2m_\mu c)^{-1}\{\mathbf{p}_\mu, \rho_\mu(\mathbf{r}')\} \tag{26}$$

We are now in a position to demonstrate that the charge and current densities obey the equation of continuity, which also may be called the law of charge conservation. Let $f_\mu = f(\mathbf{r}_\mu)$ be an arbitrary function of the position $\mathbf{r}_\mu$. Clearly $f_\mu$ commutes with $V(\mathbf{r}^N)$, hence

$$[H, f_\mu] - [T, f_\mu] = 0 \tag{27}$$

Using equations (4), (5) and (24) we can derive from this the operator equation

$$\dot{f}_\mu - (2m_\mu)^{-1}\left[\mathbf{p}_\mu\cdot(\nabla_\mu f_\mu) + (\nabla_\mu f_\mu)\cdot\mathbf{p}_\mu\right] = 0 \tag{28}$$

where $(\nabla_\mu f_\mu)$ is another function of position, that is, the operator $\nabla_\mu$ does not act beyond the parentheses. On the other hand, taking matrix elements of equation (27) we find

$$(E_m - E_n)\langle\psi_{m\alpha}|f_\mu|\psi_{n\beta}\rangle - (2m_\mu)^{-1}\langle\psi_{m\alpha}|[\mathbf{p}_\mu\cdot\mathbf{p}_\mu, f_\mu]|\psi_{n\beta}\rangle = 0 \tag{29}$$

Obviously, equations (28) and (29) are completely equivalent.

If we take $f_\mu = \rho_\mu(\mathbf{r}')$ in equation (28), we can replace $\nabla_\mu$ by $-\nabla'$, and obtain, using also equation (26),

$$\dot{\rho}_\mu(\mathbf{r}') + c\nabla'\cdot\mathbf{j}_\mu(\mathbf{r}') = 0 \tag{30}$$

We recognize equation (30) as the equation of continuity for the $\mu$th particle; note that charge conservation holds for each particle separately. The unusual notation $\dot{\rho}_\mu$ rather than $\partial\rho_\mu/\partial t$ is due to the fact that the observer's space $\mathbf{r}'$ was introduced as a parametric variable. Another useful form of the equation of continuity is obtained by taking the matrix elements of equation (30); the result is

$$ik_{mn}\, \rho_{\mu,m\alpha,n\beta}(\mathbf{r}') + \nabla'\cdot\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}') = 0 \tag{31}$$

where the wavenumbers $k_{mn}$ are defined by

$$k_{mn} = \omega_{mn}/c = (E_m - E_n)/\hbar c \tag{32}$$

Finally, we make the observation that equations (27) to (31) are all equivalent. For instance, equation (29) is easily derived from equation (31) by multiplying the latter by $f(\mathbf{r}')$ and integrating over $\mathbf{r}'$.
Note, however, that equations (27) to (29) are only meaningful with reference to the non-relativistic Hamiltonian; equations (30) and (31) contain the charge and current densities formally, and may therefore be valid for other Hamiltonians as well.

3. DIRAC THEORY FOR A SINGLE RELATIVISTIC PARTICLE

In the one-particle relativistic quantum mechanics of Dirac the wave function is generalized to a four-component spinor; in particular, for the stationary states we have

$$\Psi_{s,m\alpha}(\mathbf{r},t) = \psi_{s,m\alpha}(\mathbf{r})\, e^{-i\omega_m t}, \qquad s = 1, 2, 3, 4 \tag{33}$$

It is useful to consider the index $s$ as another variable in the wave function; it is obviously a discrete variable.

For most formal considerations, the index $s$ can be suppressed. When that is done, we understand that $\Psi$ and $\psi$ represent column vectors; the corresponding Hermitian conjugate row vectors are designated by $\Psi^*$ and $\psi^*$, respectively. Product formation is to be treated according to the rules of matrix algebra; $\Psi^*\Phi$ implies summation over spinor components, and $\Psi\Phi^*$ is a $4 \times 4$ matrix. The latter is an example of an operator which acts on the discrete variable; in general, an operator $A$ is a $4 \times 4$ matrix with elements $A_{st}$. If an operator does not act on the discrete variable, as, for instance, the position $\mathbf{r}$ or the momentum $\mathbf{p}$, the $4 \times 4$ identity matrix is implied as a factor to make it a genuine operator in the world of spinors; we say that the operator is Dirac-diagonal.

It is to be noted that the matrix structure of operators due to the spinor character of the wave functions is separate and distinct from the formation of matrix elements with respect to stationary state wave functions. The latter are defined, as before, by equations (11); however, with the present meaning of wave functions and operators, the integrals in equations (11) contain summations over spinor components as well as integrations.
In the Dirac theory a central role is played by the set of Dirac matrices $\alpha_x$, $\alpha_y$, $\alpha_z$, $\beta$ satisfying

$$\begin{aligned}
&\alpha_x^* = \alpha_x, \quad \alpha_y^* = \alpha_y, \quad \alpha_z^* = \alpha_z, \quad \beta^* = \beta \\
&\alpha_x^2 = \alpha_y^2 = \alpha_z^2 = \beta^2 = I \\
&\{\alpha_x, \alpha_y\} = \{\alpha_y, \alpha_z\} = \{\alpha_z, \alpha_x\} = \{\alpha_x, \beta\} = \{\alpha_y, \beta\} = \{\alpha_z, \beta\} = 0
\end{aligned} \tag{34}$$

where $I$ is the $4 \times 4$ identity matrix. The equations (34) express that the Dirac matrices are unitary Hermitian and anticommute with one another. While any choice of matrices satisfying equations (34) is acceptable, a commonly chosen representation is given by

$$\alpha_x = \begin{pmatrix} 0&0&0&1 \\ 0&0&1&0 \\ 0&1&0&0 \\ 1&0&0&0 \end{pmatrix}, \quad
\alpha_y = \begin{pmatrix} 0&0&0&-i \\ 0&0&i&0 \\ 0&-i&0&0 \\ i&0&0&0 \end{pmatrix}, \quad
\alpha_z = \begin{pmatrix} 0&0&1&0 \\ 0&0&0&-1 \\ 1&0&0&0 \\ 0&-1&0&0 \end{pmatrix}, \quad
\beta = \begin{pmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&-1&0 \\ 0&0&0&-1 \end{pmatrix} \tag{35}$$

It can be shown that any other choice satisfying equations (34) can be transformed into the form (35) by a similarity transformation, which, with a corresponding transformation of the spinor wave functions, yields the same physical results.

With the help of the operators $\boldsymbol{\alpha}$, $\beta$ we can now define the Dirac Hamiltonian

$$H_D = \beta m' c^2 + \boldsymbol{\alpha}\cdot\mathbf{p}\, c + V(\mathbf{r}) \tag{36}$$

where $V(\mathbf{r})$ is the potential energy of the particle. Note that neither $\mathbf{p}$ nor $V(\mathbf{r})$ acts on the discrete variable. We wrote $m'$ for the mass of the particle, rather than $m$, in order to avoid confusion with the wave function index $m$.

With the Dirac Hamiltonian in hand, and the interpretation of the wave functions as spinors, we can now translate most of the formalism of the preceding section into the proper one-particle relativistic equivalent. Up to and including equation (22) we only need to drop the individual particle labels, and replace $H$ by $H_D$. The specific definition of the Hamiltonian, equation (23), is replaced by equation (36). In lieu of equations (25) and (26) we now get for the velocity and current density

$$\dot{\mathbf{r}} = \boldsymbol{\alpha} c \tag{37}$$

$$\mathbf{j}(\mathbf{r}') = \boldsymbol{\alpha}\, \rho(\mathbf{r}') \tag{38}$$

We can again demonstrate the validity of the equation of continuity.
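The algebraic relations (34) can be checked mechanically for any candidate set of matrices. The sketch below does so for the standard Dirac representation, which is assumed here as the "commonly chosen representation"; the helper names (`block`, `matmul`, `dagger`, `anticommutator`) are introduced only for this illustration.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
MINUS_I2 = [[-1, 0], [0, -1]]
# 2 x 2 Pauli matrices, used here only as building blocks.
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def block(a, b, c, d):
    """Assemble a 4 x 4 matrix from the 2 x 2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(X, Y):
    """Plain matrix product of two square nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(X):
    """Hermitian conjugate (conjugate transpose)."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]


def anticommutator(X, Y):
    """{X, Y} = XY + YX for 4 x 4 matrices."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] + YX[i][j] for j in range(4)] for i in range(4)]


IDENTITY = block(I2, Z2, Z2, I2)
ZERO = block(Z2, Z2, Z2, Z2)

# Assumed (standard) representation: alpha_k has Pauli blocks off the
# diagonal, beta = diag(1, 1, -1, -1).
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]   # alpha_x, alpha_y, alpha_z
BETA = block(I2, Z2, Z2, MINUS_I2)
MATRICES = ALPHA + [BETA]

# The three lines of equations (34).
hermitian = all(dagger(g) == g for g in MATRICES)
square_to_identity = all(matmul(g, g) == IDENTITY for g in MATRICES)
anticommute = all(
    anticommutator(MATRICES[i], MATRICES[j]) == ZERO
    for i in range(4) for j in range(i + 1, 4))

print(hermitian, square_to_identity, anticommute)
```

Because all entries are 0, ±1 or ±i, the comparisons are exact and no tolerance is needed.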
We note that a function of position $f(\mathbf{r})$ commutes with $V$ and with $\beta m' c^2$, hence

$$[H_D, f] - c[\boldsymbol{\alpha}\cdot\mathbf{p}, f] = 0 \tag{39}$$

Using equations (4) and (5) we can derive from this the operator equation

$$\dot{f} - c\,\boldsymbol{\alpha}\cdot(\nabla f) = 0 \tag{40}$$

On the other hand, taking matrix elements of equation (39) we obtain

$$(E_m - E_n)\langle\psi_{m\alpha}|f|\psi_{n\beta}\rangle - c\langle\psi_{m\alpha}|[\boldsymbol{\alpha}\cdot\mathbf{p}, f]|\psi_{n\beta}\rangle = 0 \tag{41}$$

Equations (40) and (41) are completely equivalent.

The equation of continuity follows readily in this case if we specify $f = \rho(\mathbf{r}')$ in equation (40), make use of equation (38) for the current density, and observe that we can replace $\nabla$ by $-\nabla'$:

$$\dot{\rho}(\mathbf{r}') + c\nabla'\cdot\mathbf{j}(\mathbf{r}') = 0 \tag{42}$$

or, taking matrix elements,

$$ik_{mn}\, \rho_{m\alpha,n\beta}(\mathbf{r}') + \nabla'\cdot\mathbf{j}_{m\alpha,n\beta}(\mathbf{r}') = 0 \tag{43}$$

Equations (42) and (43) are identical with the non-relativistic equations (30) and (31), if we omit the particle index $\mu$ in the latter. Note however that the resemblance is a formal one, since the current densities are given by different expressions, equations (26) and (38), in the two cases. This also accounts for the fact that equations (40) and (41) cannot be obtained from equations (28) and (29) by dropping the particle index $\mu$.

4. QUANTIZATION OF A SYSTEM OF ELECTROMAGNETIC PARTICLES

The behaviour and mutual interaction of electrically charged particles can, in first approximation, be stated in terms of electrostatic forces only. If the external field is time-independent, we have a Hamiltonian system, and the quantization sketched in Section 2 applies. However, if we want to do justice to the fact that we are dealing with currents as well as charges, and magnetic as well as electric fields, the system is no longer Hamiltonian: the energy cannot be expressed only in terms of the instantaneous positions and momenta of the particles, since mutual interaction of the particles involves retarded and/or advanced potentials.

Clearly it is a desirable goal to reformulate the problem in a Hamiltonian manner: this is the approach taken in the development of quantum electrodynamics.
In that approach, new dynamical variables are introduced which describe the electromagnetic field, and the combined system of particles and field is considered, to which the rules of quantization are then applied. As is well known, this process builds in the self-energy of the particles, which presents considerable conceptual and calculational difficulties.

In this paper we present an alternative approach which avoids the difficulties just mentioned. It will be seen that our scheme yields an unambiguous and valid framework to deal with systems of electromagnetic particles beyond the electrostatic approximation, at least for moderate energies.

We wish to retain as much as possible the concepts and methods of the preceding sections. Although we do not have a Hamiltonian operator, we still assume that with respect to an external observer the system can be described in terms of stationary states, as expressed by equation (7a). Since we no longer have a Schrödinger or a Dirac equation, the wave functions and energies will have to be determined from some other principle; developing such a principle is one of the key objectives of this work.

Inasmuch as the energy is not a relativistic invariant, the assumption of stationary states is not a covariant one; rather we have chosen a specially simple form of representation for one Lorentz frame, namely the rest frame of the observer. Also, a consistent covariant formulation would require $N$ time coordinates as companions to the $N$ sets of space coordinates; our wave functions maintain $N$ sets of space coordinates, but only one time.

While we thus adopt stationary state functions of the type (7a), we must of course demand that $\Psi_{m\alpha}$ and $\psi_{m\alpha}$ are appropriate generalizations of proper relativistic one-particle wave functions.
We shall restrict ourselves to Dirac particles: we permit different masses and charges, but each particle, if alone, would be represented by a four-component Dirac spinor. In general, for $N$-particle wave functions the coordinate space is the direct product space of the $N$ single particle spaces. Since the four components of a Dirac spinor may be considered to arise from a discrete variable capable of four values, the discrete aspect of the $N$-particle product space leads to a spinor with $4^N$ components. We label these components by the compound index

$$s^N = (s_1, s_2, \ldots, s_N) \tag{44}$$

where $s_\mu$ refers to the components with respect to the $\mu$th particle, so that $1 \le s_\mu \le 4$. The stationary state spinor wave functions are given by

$$\Psi_{s_1 s_2 \ldots s_N, m\alpha}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t) = \psi_{s_1 s_2 \ldots s_N, m\alpha}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)\, e^{-i\omega_m t} \tag{45}$$

or, in condensed notation,

$$\Psi_{s^N, m\alpha}(\mathbf{r}^N, t) = \psi_{s^N, m\alpha}(\mathbf{r}^N)\, e^{-i\omega_m t} \tag{45a}$$

As in the single-particle case, the spinor index $s^N$ can usually be suppressed. When that is done, we understand $\Psi$ and $\psi$ to represent column vectors, the components being ordered by taking $s_1 s_2 \cdots s_N$ in dictionary order; $\Psi^*$ and $\psi^*$ are of course the complex conjugate row vectors. Again, matrix algebra applies: $\Psi^*\Phi$ is a scalar, and $\Psi\Phi^*$ is a $4^N \times 4^N$ matrix. In general, an operator $A$ is a $4^N \times 4^N$ matrix with elements $A_{s_1 s_2 \ldots s_N, t_1 t_2 \ldots t_N}$, or, in condensed notation, $A_{s^N, t^N}$. Taking the matrix elements of an operator with respect to the stationary state wave functions is again accomplished by equations (11), which implies complete summations over spinor components as well as integrations.

If $a$ is any $4 \times 4$ matrix operator for a single particle with components $a_{st}$, we define the corresponding operator for the $\mu$th particle, $a_\mu$, by

$$a_{\mu; s^N, t^N} = \delta_{s_1 t_1}\, \delta_{s_2 t_2} \cdots \delta_{s_{\mu-1} t_{\mu-1}}\, a_{s_\mu t_\mu}\, \delta_{s_{\mu+1} t_{\mu+1}} \cdots \delta_{s_N t_N} \tag{46}$$

In matrix notation, this is expressed by

$$a_\mu = I \times I \times \cdots \times I \times a \times I \times \cdots \times I \tag{46a}$$

where $\times$ indicates direct multiplication of matrices, and $a$ occurs as the $\mu$th factor.
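The direct-product construction (46a) can be sketched in a few lines. The example below builds two-particle operators from single-particle $4 \times 4$ matrices, checks the element rule (46) with the compound index in dictionary order, and confirms that operators belonging to different particles commute. The function names `kron` and `embed`, and the use of $\alpha_x$ and $\beta$ in the standard representation as test matrices, are assumptions made for illustration.

```python
from functools import reduce


def kron(X, Y):
    """Direct (Kronecker) product; compound indices in dictionary order."""
    return [[x * y for x in xrow for y in yrow] for xrow in X for yrow in Y]


def matmul(X, Y):
    """Plain matrix product of two square nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def embed(a, mu, n_particles):
    """Equation (46a): a acts on particle mu (1-based), identity elsewhere."""
    factors = [identity(len(a))] * n_particles
    factors[mu - 1] = a
    return reduce(kron, factors)


# Single-particle test matrices (standard representation assumed).
ALPHA_X = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

a_1 = embed(ALPHA_X, 1, 2)   # alpha_x acting on particle 1
b_2 = embed(BETA, 2, 2)      # beta acting on particle 2

# Operators on different particles commute.
different_particles_commute = matmul(a_1, b_2) == matmul(b_2, a_1)

# Element rule (46) for N = 2, with 0-based compound index 4*s1 + s2.
element_rule_holds = all(
    a_1[4 * s1 + s2][4 * t1 + t2] == ALPHA_X[s1][t1] * (1 if s2 == t2 else 0)
    for s1 in range(4) for s2 in range(4)
    for t1 in range(4) for t2 in range(4))

print(len(a_1), different_particles_commute, element_rule_holds)
```

For $N$ particles the matrices have dimension $4^N$, so this explicit representation is practical only for very small $N$; it is meant as a check on the bookkeeping, not as a computational method.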
It is easily seen that if $a_\mu$ and $b_\nu$ are any two operators so constructed, we have

$$[a_\mu, b_\nu] = 0, \qquad \mu \ne \nu \tag{47}$$

Among the one-particle operators which we can construct according to equation (46) are the generalized Dirac matrices $\boldsymbol{\alpha}_\mu$, $\beta_\mu$, the coordinates and momenta $\mathbf{r}_\mu$, $\mathbf{p}_\mu$, and the charge density $\rho_\mu(\mathbf{r}')$.

Since we do not have a Hamiltonian in hand, time derivatives of operators cannot be defined by equation (4), but we must in general use the matrix element form, equations (16), instead. However, for the velocity $\dot{\mathbf{r}}_\mu$ we desire a simpler definition in terms of fundamental operators associated with the $\mu$th particle; this requirement is dictated by the fundamental role played by the current density in electromagnetic phenomena. Taking our cue from the single particle Dirac formalism, we adopt the generalization of equation (37) as an ad hoc postulate, namely

$$\dot{\mathbf{r}}_\mu = \boldsymbol{\alpha}_\mu c \tag{48}$$

Applying equation (21) we then find for the current density

$$\mathbf{j}_\mu(\mathbf{r}') = \boldsymbol{\alpha}_\mu\, \rho_\mu(\mathbf{r}') \tag{49}$$

analogous to equation (38).

Another relation which cannot be derived in the absence of a Hamiltonian is the equation of continuity; because of the fundamental significance of charge conservation, we postulate its validity. Since $\dot{\rho}_\mu(\mathbf{r}')$ is not defined as an operator, we must take the matrix element form of the equation of continuity, namely

$$ik_{mn}\, \rho_{\mu,m\alpha,n\beta}(\mathbf{r}') + \nabla'\cdot\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}') = 0 \tag{50}$$

This is formally identical with equation (31), but the current densities are defined differently in the two cases. If we multiply equation (50) by an arbitrary function $f(\mathbf{r}')$, integrate over $\mathbf{r}'$, and use equation (49), we obtain the equivalent form

$$(E_m - E_n)\langle\psi_{m\alpha}|f_\mu|\psi_{n\beta}\rangle - c\langle\psi_{m\alpha}|[\boldsymbol{\alpha}_\mu\cdot\mathbf{p}_\mu, f_\mu]|\psi_{n\beta}\rangle = 0 \tag{51}$$

If we take $f_\mu = 1$, equation (51) reconfirms part of the orthogonality relations, equation (9), namely for $E_m \ne E_n$.
A more interesting result is obtained by taking $f_\mu \to \mathbf{r}_\mu$; we find, using the commutation relation (3),

$$i\omega_{mn}\, \mathbf{r}_{\mu,m\alpha,n\beta} = c\, \boldsymbol{\alpha}_{\mu,m\alpha,n\beta} \tag{52}$$

Hence $\dot{\mathbf{r}}_\mu$ also satisfies the general definition of time derivatives, equations (16), as of course it should.

We now proceed to the main task of this section, namely to write down a valid quantum mechanical expression for the energies $E_m$. Again, the corresponding operator, $H$, is not available; nevertheless it is possible to obtain a valid expression for $E_m$. We partition $E_m$ according to

$$E_m = E_{D,m} + E_{I,m} \tag{53}$$

where $E_{D,m}$ is the many-particle generalization of the Dirac energy, and $E_{I,m}$ is the particle-particle interaction energy. The Dirac energy is given by

$$E_{D,m} = \langle\psi_{m\alpha}|H_D|\psi_{m\alpha}\rangle \tag{54}$$

where

$$H_D = \sum_\mu \left(\beta_\mu m_\mu c^2 + \boldsymbol{\alpha}_\mu\cdot\mathbf{p}_\mu c\right) \tag{55}$$

is simply the many-particle sum of the individual particle Dirac Hamiltonians, without the potential energy. The latter has been omitted since we consider our system a closed system of electromagnetic particles; any additional energy is to be accounted for in the interaction energy. Obviously if $E_{D,m}$ were the only contribution to the energy (we could even add an external potential energy) we would have a Hamiltonian, namely $H_D$. We shall see shortly that the interaction energy $E_{I,m}$ cannot be written as the diagonal matrix element of an operator.

To formulate the electromagnetic interaction energy of the $N$-particle system we follow Wheeler and Feynman (1945, 1949). In this view direct particle-particle interaction is paramount, and the electromagnetic field plays a subordinate role. The interaction energy is then quantized, that is, reformulated in terms of matrix elements over charge and current densities; the electromagnetic field is never quantized independently. The deviations from conventional quantum electrodynamics are important; they may be summarized as follows.

(1).
There is no such concept as 'the' electromagnetic field with degrees of freedom of its own. Instead, there is a collection of adjunct fields, each produced by an individual particle, and completely determined by the motion of that particle.

(2). The prevailing field acting on a given particle is determined by the sum of the fields produced by every particle other than the given particle. The interaction of a particle with its own field does not occur.

(3). The fields produced by the particles are always taken half-retarded and half-advanced. We note that this is the necessary and sufficient condition that energy and momentum remain conserved, and therefore defined, within a finite, but perhaps very large, volume. In classical electrodynamics, the half-retarded and half-advanced solution describes a system of particles that neither emits nor absorbs radiation.

In the light of these remarks, we now write down the classical electromagnetic interaction energy

$$E_I = \tfrac{1}{2} \sum_\mu \sum_{\nu\ne\mu} \int d\mathbf{r}' \left[\rho_\mu(\mathbf{r}',t)\, \phi_\nu(\mathbf{r}',t) - \mathbf{j}_\mu(\mathbf{r}',t)\cdot\mathbf{A}_\nu(\mathbf{r}',t)\right] \tag{56}$$

where in general $\rho_\mu(\mathbf{r}',t)$, $\mathbf{j}_\mu(\mathbf{r}',t)$, $\phi_\mu(\mathbf{r}',t)$, $\mathbf{A}_\mu(\mathbf{r}',t)$ are the classical time-dependent charge and current densities, and scalar and vector potentials associated with the $\mu$th particle. The latter are given by

$$\begin{aligned}
\phi_\mu(\mathbf{r}',t) &= \tfrac{1}{2} \int d\mathbf{r}''\, R^{-1} \left[\rho_\mu(\mathbf{r}'', t - R/c) + \rho_\mu(\mathbf{r}'', t + R/c)\right] \\
\mathbf{A}_\mu(\mathbf{r}',t) &= \tfrac{1}{2} \int d\mathbf{r}''\, R^{-1} \left[\mathbf{j}_\mu(\mathbf{r}'', t - R/c) + \mathbf{j}_\mu(\mathbf{r}'', t + R/c)\right]
\end{aligned} \tag{57}$$

where we used the abbreviation

$$R = |\mathbf{r}' - \mathbf{r}''| \tag{58}$$

To properly quantize this scheme we must introduce the time-dependent matrix elements for the densities and potentials, as explained above, equations (11) and (12). When that is done, the quantized equations (57) can be reduced to relations between time-independent matrix elements by dividing by the exponential time factor; the result is

$$\begin{aligned}
\phi_{\mu,m\alpha,n\beta}(\mathbf{r}') &= \int d\mathbf{r}''\, R^{-1} \cos(k_{mn}R)\, \rho_{\mu,m\alpha,n\beta}(\mathbf{r}'') \\
\mathbf{A}_{\mu,m\alpha,n\beta}(\mathbf{r}') &= \int d\mathbf{r}''\, R^{-1} \cos(k_{mn}R)\, \mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}'')
\end{aligned} \tag{59}$$

We now quantize equation (56) by introducing time-dependent matrix elements in the right-hand side, and observing the rule for product formation, equations (15). When that is done, the exponential time factors cancel, and we obtain for the quantum mechanical interaction energy

$$E_{I,m} = \tfrac{1}{2} \sum_\mu \sum_{\nu\ne\mu} \sum_{n\beta} \int d\mathbf{r}' \left[\rho_{\mu,m\alpha,n\beta}(\mathbf{r}')\, \phi_{\nu,n\beta,m\alpha}(\mathbf{r}') - \mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}')\cdot\mathbf{A}_{\nu,n\beta,m\alpha}(\mathbf{r}')\right] \tag{60}$$

We can eliminate the potentials using equations (59) to obtain

$$E_{I,m} = \tfrac{1}{2} \sum_\mu \sum_{\nu\ne\mu} \sum_{n\beta} \int d\mathbf{r}' \int d\mathbf{r}''\, R^{-1} \cos(k_{mn}R) \left[\rho_{\mu,m\alpha,n\beta}(\mathbf{r}')\, \rho_{\nu,n\beta,m\alpha}(\mathbf{r}'') - \mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}')\cdot\mathbf{j}_{\nu,n\beta,m\alpha}(\mathbf{r}'')\right] \tag{61}$$

In this expression, the symmetry of the interaction between particles is apparent.

We now combine the Dirac and interaction energies to obtain the total energy. Since we will soon need to consider the energies as functionals of trial wave functions, we display the wave functions and energies explicitly wherever they occur. We also need to generalize the formula in such a way that invariance with respect to unitary transformations of degenerate wave functions becomes guaranteed and transparent. The resulting formula is

$$\begin{aligned}
E_m \delta_{\alpha\beta} = {}& \langle\psi_{m\alpha}|H_D|\psi_{m\beta}\rangle + \tfrac{1}{2} \sum_\mu \sum_{\nu\ne\mu} \sum_{n\gamma} \int d\mathbf{r}' \int d\mathbf{r}''\, R^{-1} \cos[(E_m - E_n)R/\hbar c] \\
& \times \left[\langle\psi_{m\alpha}|\rho_\mu(\mathbf{r}')|\psi_{n\gamma}\rangle \langle\psi_{n\gamma}|\rho_\nu(\mathbf{r}'')|\psi_{m\beta}\rangle - \langle\psi_{m\alpha}|\mathbf{j}_\mu(\mathbf{r}')|\psi_{n\gamma}\rangle \cdot \langle\psi_{n\gamma}|\mathbf{j}_\nu(\mathbf{r}'')|\psi_{m\beta}\rangle\right]
\end{aligned} \tag{62}$$

If the factor $\cos[(E_m - E_n)R/\hbar c]$ were absent in equation (62), we could carry out the summation over $n\gamma$ using equation (10); $E_m$ would then reduce to the diagonal matrix element of an operator. Hence it is the retardation-advancement effect of the electromagnetic interaction, expressed in the factor $\cos[(E_m - E_n)R/\hbar c]$, which prevents the definition of a Hamiltonian for our relativistic many-particle system.
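The origin of the cosine factor can be made explicit in one step; the following is a sketch based only on the time dependence of matrix elements, equation (12), and the definition of the wavenumbers, equation (32). Inserting the time-dependent matrix element of the charge density into the half-retarded, half-advanced combination gives

$$\tfrac{1}{2}\left[\rho_{\mu,m\alpha,n\beta}(\mathbf{r}'')\, e^{i\omega_{mn}(t - R/c)} + \rho_{\mu,m\alpha,n\beta}(\mathbf{r}'')\, e^{i\omega_{mn}(t + R/c)}\right] = \rho_{\mu,m\alpha,n\beta}(\mathbf{r}'')\, e^{i\omega_{mn}t}\, \tfrac{1}{2}\left[e^{-ik_{mn}R} + e^{ik_{mn}R}\right] = \rho_{\mu,m\alpha,n\beta}(\mathbf{r}'')\, e^{i\omega_{mn}t} \cos(k_{mn}R)$$

The common factor $e^{i\omega_{mn}t}$ is the one divided out in passing to time-independent matrix elements. A purely retarded or purely advanced potential would instead leave the complex factor $e^{\mp ik_{mn}R}$; the real cosine is specific to the symmetric combination.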
In a Hamiltonian case in general, if the energy expression is considered as a functional of a trial wave function, we can apply the variation principle to this expression: we demand that the energy be stationary to first order for any change in the wave function which preserves normalization. As is well known, this leads to the Schrödinger (or Dirac) equation.

In the present case we do have an energy expression, but even formally there are important differences with the Hamiltonian case. In the Hamiltonian case the energy $E_m$ is expressed, explicitly and bilinearly, in terms of any one of the wave functions $\psi_{m\alpha}$ belonging to the level $m$. Our equation (62) on the other hand is an infinite set of implicit transcendental equations, each equation containing all $E_m$ and all $\psi_{n\beta}$. We consider that equation (62) nevertheless defines the $E_m$ as functionals; we must however expect that each $E_m$ is a functional of all $\psi_{n\beta}$.

We now adopt the variation principle in the following form. We demand that all $E_m$ are stationary simultaneously to first order for any change in the set of trial wave functions $\psi_{n\beta}$ which is constrained by orthonormality, equation (9), and completeness, equation (10). We furthermore expect that charge conservation, equation (51), holds for the solution of this variational problem.

A direct attempt to derive practical equations for the wave functions from this principle, as a replacement for the Schrödinger-Dirac equation, does not appear to be a simple matter. We shall see, however, that it leads to a straightforward and orderly procedure within the Pauli perturbation scheme.

5. PAULI APPROXIMATION FOR A SINGLE RELATIVISTIC PARTICLE

The Pauli approximation is based on the assumption that the wave functions and energies can be expanded using $c^{-1}$ as the expansion parameter. Writing

$$\lambda = c^{-1} \tag{63}$$

we proceed from the assumption that the Hamiltonian, energies and wave functions are analytical functions of $\lambda$. For the sake of clarity, we shall in the following display the dependence on $\lambda$ explicitly whenever appropriate; hence we shall write $H_D(\lambda)$, $E_m(\lambda)$, $\psi_{m\alpha}(\lambda)$.

The Dirac Hamiltonian, and consequently the energies, are obviously of order $\lambda^{-2}$. It is convenient to introduce the scaled Hamiltonian and energies defined by

$$\eta(\lambda) = \lambda^2 H_D(\lambda) = M + P\lambda + V\lambda^2 \tag{64}$$

$$\varepsilon_m(\lambda) = \lambda^2 E_m(\lambda) \tag{65}$$

The scaled Dirac equation is

$$\eta(\lambda)\, \psi_{m\alpha}(\lambda) = \varepsilon_m(\lambda)\, \psi_{m\alpha}(\lambda) \tag{66}$$

In equation (64) we introduced the operators

$$M = \beta m' \tag{67}$$

$$P = \boldsymbol{\alpha}\cdot\mathbf{p} \tag{68}$$

$M$ is the rest mass operator, and $P$ may be considered the momentum magnitude, since the commutation properties of the Dirac matrices assure that

$$P^2 = \mathbf{p}\cdot\mathbf{p} \tag{69}$$

It is useful to establish first some properties of the solutions of the scaled Dirac equation (66) for $\lambda = 0$. In preparation for the many-particle case, we shall make a distinction between the operators $M$ and $\beta$, although they are in this case proportional, see equation (67). We have

$$\begin{aligned}
M\psi_{m\alpha}(0) &= \varepsilon_m(0)\, \psi_{m\alpha}(0) \\
\beta\psi_{m\alpha}(0) &= \tau_m\, \psi_{m\alpha}(0)
\end{aligned} \tag{70}$$

Since $\beta$ is unitary Hermitian, the eigenvalues are given by

$$\varepsilon_m(0) = \tau_m m', \qquad \tau_m = \pm 1 \tag{71}$$

We say that $\psi_{m\alpha}(0)$ has rest mass $\pm m'$ for $\tau_m = \pm 1$. At this point the wave functions $\psi_{m\alpha}(0)$ are still highly degenerate. In fact, all we can say so far is that for positive/negative rest mass the lower/upper pair of spinor components vanishes. This leaves completely undetermined the dependence on the space coordinate $\mathbf{r}$, and on the spin coordinate, the latter being the discrete index labelling the non-vanishing spinor components.

Because of the commutation properties of the Dirac matrices we have

$$\beta\, \eta(\lambda)\, \beta = \eta(-\lambda) \tag{72}$$

and therefore, from equation (66),

$$\eta(-\lambda)\, \beta\psi_{m\alpha}(\lambda) = \varepsilon_m(\lambda)\, \beta\psi_{m\alpha}(\lambda) \tag{73}$$

Clearly, the operator $\beta$ maps the eigenfunctions of the level $m$ of $\eta(\lambda)$ into those of some level, say $n$, of $\eta(-\lambda)$, such that $\varepsilon_m(\lambda) = \varepsilon_n(-\lambda)$.
This mapping $m \to n$ remains valid for continuous changes in $\lambda$, and in particular holds for $\lambda = 0$. But for $\lambda = 0$ the mapping by $\beta$ is stated in the second equation (70), so that $m = n$. Hence we have proved that

$$\varepsilon_m(\lambda) = \varepsilon_m(-\lambda) \tag{74}$$

and that $\beta\psi_{m\alpha}(\lambda)$ is a linear transformation of $\psi_{m\beta}(-\lambda)$. In Appendix A it is shown that the $\psi_{m\alpha}(\lambda)$ may be chosen so that the transformation induced by $\beta$ takes on the simple canonical form

$$\beta\psi_{m\alpha}(\lambda) = \tau_m\, \psi_{m\alpha}(-\lambda) \tag{75}$$

We assume in the following that equation (75) always holds.

We now put forward the Pauli perturbation expansion for the wave functions and energies, namely

$$\begin{aligned}
\psi_{m\alpha}(\lambda) &= \sum_{p=0}^{\infty} \psi_{m\alpha,p}\, \lambda^p \\
\varepsilon_m(\lambda) &= \sum_{p=0}^{\infty} \varepsilon_{m,p}\, \lambda^{2p}
\end{aligned} \tag{76}$$

where we have limited the energy expansion to even powers of $\lambda$, because of equation (74). From equation (75) follows the important relation

$$\beta\psi_{m\alpha,p} = \tau_m (-1)^p\, \psi_{m\alpha,p} \tag{77}$$

The fact that the leading term in the scaled energy expansion is always finite points to a limitation of the Pauli perturbation expansion. Namely, for a finite value of $\lambda$ there are stationary states of arbitrarily high scaled energy; connecting these to a finite scaled energy for $\lambda = 0$ cannot be achieved by a uniformly convergent process. Hence the Pauli perturbation expansion is at best a semiconvergent process, and is practical perhaps only for states with energies close to the rest mass energies.

We now proceed to determine the wave functions and energies in more detail. Substitution of the expansions (76) into the scaled Dirac equation (66) should give us the necessary and sufficient equations to determine the wave functions and energies term by term for each power of $\lambda$. An immediate simplification is obtained from the relation

$$(M - \varepsilon_{m,0})\, \psi_{m\alpha,p} = \begin{cases} 0, & p \text{ even} \\ 2M\psi_{m\alpha,p}, & p \text{ odd} \end{cases} \tag{78}$$

which is easily proved using equations (67), (70) and (77).
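The simplest illustration of equations (72), (74) and (76) is a free particle ($V = 0$) in a momentum eigenstate, where $\mathbf{p}$ can be replaced by a numerical vector. This is a sketch under that assumption, with hypothetical values for $m'$ and $\mathbf{p}$ and the standard representation of the Dirac matrices. Because $M$ and $P$ anticommute, $\eta(\lambda)^2 = (m'^2 + \lambda^2\, \mathbf{p}\cdot\mathbf{p})I$, so the positive-rest-mass scaled energy is $\sqrt{m'^2 + \lambda^2\, \mathbf{p}\cdot\mathbf{p}}$, an even function of $\lambda$ whose expansion begins $m' + \lambda^2\, \mathbf{p}\cdot\mathbf{p}/2m' - \lambda^4 (\mathbf{p}\cdot\mathbf{p})^2/8m'^3$.

```python
import math

M_PRIME = 1.0                  # hypothetical rest mass m'
P_VEC = (1.0, 2.0, 2.0)        # hypothetical momentum components, p.p = 9
P_SQUARED = sum(p * p for p in P_VEC)

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
MINUS_I2 = [[-1, 0], [0, -1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def block(a, b, c, d):
    """Assemble a 4 x 4 matrix from the 2 x 2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(4) for j in range(4))


ALPHA = [block(Z2, s, s, Z2) for s in PAULI]   # standard representation
BETA = block(I2, Z2, Z2, MINUS_I2)


def eta(lam):
    """Scaled free-particle Hamiltonian M + lam * P, equation (64) with V = 0."""
    return [[M_PRIME * BETA[i][j]
             + lam * sum(P_VEC[k] * ALPHA[k][i][j] for k in range(3))
             for j in range(4)] for i in range(4)]


def exact_energy(lam):
    """Positive-rest-mass eigenvalue of eta(lam)."""
    return math.sqrt(M_PRIME ** 2 + lam ** 2 * P_SQUARED)


def pauli_series(lam):
    """First three terms of the energy expansion (76) for this case."""
    x = lam ** 2 * P_SQUARED
    return M_PRIME + x / (2 * M_PRIME) - x ** 2 / (8 * M_PRIME ** 3)


lam = 0.05
scaled_identity = [[exact_energy(lam) ** 2 if i == j else 0.0
                    for j in range(4)] for i in range(4)]

# eta(lam)^2 is a multiple of the identity, so the eigenvalues are +/- sqrt(...).
square_err = max_abs_diff(matmul(eta(lam), eta(lam)), scaled_identity)
# Equation (72): beta * eta(lam) * beta = eta(-lam).
mapping_err = max_abs_diff(matmul(matmul(BETA, eta(lam)), BETA), eta(-lam))
# Truncation error of the three-term series at small lam.
series_err = abs(exact_energy(lam) - pauli_series(lam))

print(square_err, mapping_err, series_err)
```

In this special case the series in $\lambda^2$ is the binomial series of a square root and converges only for $\lambda|\mathbf{p}| < m'$, which is consistent with the remark that the expansion is useful mainly for states close to the rest mass energy.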
The equation arising from $\lambda^0$ is identically satisfied; the next four are

$$\begin{aligned}
2M\psi_{m\alpha,1} + P\psi_{m\alpha,0} &= 0 \\
P\psi_{m\alpha,1} + (V - \varepsilon_{m,1})\psi_{m\alpha,0} &= 0 \\
2M\psi_{m\alpha,3} + P\psi_{m\alpha,2} + (V - \varepsilon_{m,1})\psi_{m\alpha,1} &= 0 \\
P\psi_{m\alpha,3} + (V - \varepsilon_{m,1})\psi_{m\alpha,2} - \varepsilon_{m,2}\psi_{m\alpha,0} &= 0
\end{aligned} \tag{79}$$

The first equation (79) is solved by

$$\psi_{m\alpha,1} = K\psi_{m\alpha,0} \tag{80}$$

where $K$ is the anti-Hermitian operator defined by

$$K = -(2M)^{-1}P = -(2m')^{-1}\beta\, \boldsymbol{\alpha}\cdot\mathbf{p} \tag{81}$$

It is interesting to note that equation (80) permits $\psi_{m\alpha,1}$ to be calculated directly from $\psi_{m\alpha,0}$, whereas the latter is still relatively undetermined as a solution of a highly degenerate eigenvalue problem.

Substituting $\psi_{m\alpha,1}$ from equation (80) into the second equation (79) we obtain

$$(T + V - \varepsilon_{m,1})\, \psi_{m\alpha,0} = 0 \tag{82}$$

where $T$ is the Hermitian operator defined by

$$T = (2m')^{-1}\beta\, \mathbf{p}\cdot\mathbf{p} \tag{83}$$

For wave functions of positive rest mass the operator $\beta$ acts like the identity, and equation (82) becomes equivalent to the non-relativistic Schrödinger equation for a particle with spin one half, moving in a field represented by a spin-independent Hamiltonian.

We now proceed to calculate the lowest order relativistic correction to the energy, $\varepsilon_{m,2}$. Taking equation (80) and the last two equations (79) we can eliminate $\psi_{m\alpha,1}$ and $\psi_{m\alpha,3}$; the result is

$$(T + V - \varepsilon_{m,1})\, \psi_{m\alpha,2} - [K(V - \varepsilon_{m,1})K + \varepsilon_{m,2}]\, \psi_{m\alpha,0} = 0 \tag{84}$$

Taking the scalar product with $\psi_{m\alpha,0}$ and using equation (82) we find for the energy correction

$$\varepsilon_{m,2} = \langle\psi_{m\alpha,0}|K(\varepsilon_{m,1} - V)K|\psi_{m\alpha,0}\rangle \tag{85}$$

The expression (85) can be transformed to yield terms which can be interpreted as representing specific physical effects. We eliminate $\varepsilon_{m,1}$ using equation (82), applying half of the operator $T + V$ to the right and the other half to the left; we obtain

$$\varepsilon_{m,2} = \langle\psi_{m\alpha,0}|T' + V_{LS}|\psi_{m\alpha,0}\rangle \tag{86}$$

where

$$T' = \tfrac{1}{2}(K^2 T + T K^2) \tag{87}$$

$$V_{LS} = \tfrac{1}{2}[K, [K, V]] \tag{88}$$

The operator $T'$ represents the relativistic mass correction. Using equations (81) and (83) one easily finds

$$T' = -(2m')^{-3}\beta\, (\mathbf{p}\cdot\mathbf{p})^2 \tag{89}$$

It is to be noted that $T'$ as given by equation (89) is not properly Hermitian if the potential energy $V$ is due in part to point charges.
In that case the wave function $\psi_{m\alpha,0}$ may have a mild discontinuity (cusp) at the site of a charge. Hermiticity of an operator involving $\mathbf{p}$ depends on partial integration, and the vanishing of the surface integrals occurring in that process. It turns out that for the wave function $\psi_{m\alpha,0}$ the operator $\mathbf{p}\cdot\mathbf{p}$ causes no problem in this respect, but $(\mathbf{p}\cdot\mathbf{p})^2$ does. In a sense this difficulty was created artificially when we replaced $\varepsilon_{m,1}$ in equation (85) by $T + V$; $T$ and $V$ introduce compensating singular behaviour when operating on $\psi_{m\alpha,0}$. In practice, a simple remedy for the non-Hermiticity of $T'$ consists in applying one factor $\mathbf{p}\cdot\mathbf{p}$ to the right and the other one to the left.

For the further interpretation of $V_{LS}$ we introduce the four-component generalization of the two-component Pauli spin vector

$$\boldsymbol{\sigma} = -\tfrac{1}{2} i\, \boldsymbol{\alpha}\times\boldsymbol{\alpha} \tag{90}$$

which satisfies

$$\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma} \tag{91}$$

The angular momentum operator which represents the spin of the particle is

$$\mathbf{s} = \tfrac{1}{2}\hbar\boldsymbol{\sigma} \tag{92}$$

If $\mathbf{a}$ and $\mathbf{b}$ are any Dirac-diagonal vector operators, we have the useful relation

$$(\boldsymbol{\alpha}\cdot\mathbf{a})(\boldsymbol{\alpha}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b}) \tag{93}$$

Equations (90), (92) and (93) are used to derive our final expression for $V_{LS}$:

$$V_{LS} = \tfrac{1}{2} m'^{-2}\left[(\nabla V\times\mathbf{p})\cdot\mathbf{s} + \tfrac{1}{4}\hbar^2 \Delta V\right] \tag{94}$$

The first term in equation (94) is the usual spin-orbit coupling; the second term, called the Darwin term, is often said to represent the Zitterbewegung. When $V$ is due to point charges, $\Delta V$ vanishes except at the charge sites. Careful analysis shows that in this case $\Delta V$ yields a delta-function. Since the two terms of equation (94) really belong together, we shall in the following call $V_{LS}$ the spin-orbit interaction operator.

Finally we note that the scheme presented so far does not make allowance for an external magnetic field, and the interaction of the spin with it: we introduced a potential $V$ which can be specialized to Coulomb potentials due to external sources, but we did not introduce a corresponding vector potential.
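The reason for our omission is that it is not straightforward to introduce a vector potential from an external source without bringing about inconsistencies; in particular, the mapping $\psi_{m\alpha}(\lambda) \leftrightarrow \psi_{m\alpha}(-\lambda)$ with the operator $\beta$ is no longer valid. Since the primary purpose of this paper is to treat interactions, we shall not dwell upon this any further. Actually, in the next section we will treat those magnetic and spin effects which arise from the interactions between electromagnetic particles, all of which are part of our quantum mechanical system.

6. PAULI APPROXIMATION FOR A RELATIVISTIC MANY-PARTICLE SYSTEM

We again consider the wave functions to depend parametrically on the variable $\lambda = c^{-1}$, and adopt the notation $E_m(\lambda)$, $\psi_{m\alpha}(\lambda)$. We generalize the operators $M$, $P$, $K$, $T$, defined before for a single particle in equations (67), (68), (81) and (83), for the many-particle case:

$$\begin{aligned}
M &= \sum_\mu M_\mu = \sum_\mu \beta_\mu m_\mu \\
P &= \sum_\mu P_\mu = \sum_\mu \boldsymbol{\alpha}_\mu\cdot\mathbf{p}_\mu \\
K &= \sum_\mu K_\mu = -\sum_\mu (2m_\mu)^{-1}\beta_\mu\, \boldsymbol{\alpha}_\mu\cdot\mathbf{p}_\mu \\
T &= \sum_\mu T_\mu = \sum_\mu (2m_\mu)^{-1}\beta_\mu\, \mathbf{p}_\mu\cdot\mathbf{p}_\mu
\end{aligned} \tag{95}$$

Note that $M$, $P$, $T$ are Hermitian, and $K$ is anti-Hermitian. The following commutation and anticommutation relations are easily verified:

$$\begin{aligned}
&[M_\mu, T_\mu] = 0 \\
&\{M_\mu, P_\mu\} = \{M_\mu, K_\mu\} = \{P_\mu, K_\mu\} = \{P_\mu, T_\mu\} = \{K_\mu, T_\mu\} = 0
\end{aligned} \tag{96}$$

The operators $P$ and $T$ can be written as commutators, namely

$$P = [K, M] \tag{97}$$

$$T = -\tfrac{1}{2}[K, P] = -\tfrac{1}{2}[K, [K, M]] \tag{98}$$

Equations (97) and (98) are specific examples of a particularly useful device: namely, if we take the commutator of two operators, each of which is the sum of one-particle operators, the result is again a sum of one-particle operators. Note that there is no analogue of this for an anticommutator.

The explicit formula for the scaled total energy, with $\lambda$ displayed as a variable, is readily obtained from equation (62); we write

$$\begin{aligned}
\varepsilon_m(\lambda)\delta_{\alpha\beta} = {}& \langle\psi_{m\alpha}(\lambda)|M + \lambda P|\psi_{m\beta}(\lambda)\rangle \\
& + \tfrac{1}{2}\lambda^2 \sum_\mu \sum_{\nu\ne\mu} \sum_{n\gamma} \int d\mathbf{r}' \int d\mathbf{r}''\, R^{-1} \cos\{[\varepsilon_m(\lambda) - \varepsilon_n(\lambda)]R/\hbar\lambda\} \\
& \times \left[\langle\psi_{m\alpha}(\lambda)|\rho_\mu(\mathbf{r}')|\psi_{n\gamma}(\lambda)\rangle \langle\psi_{n\gamma}(\lambda)|\rho_\nu(\mathbf{r}'')|\psi_{m\beta}(\lambda)\rangle \right. \\
& \left. \quad - \langle\psi_{m\alpha}(\lambda)|\mathbf{j}_\mu(\mathbf{r}')|\psi_{n\gamma}(\lambda)\rangle \cdot \langle\psi_{n\gamma}(\lambda)|\mathbf{j}_\nu(\mathbf{r}'')|\psi_{m\beta}(\lambda)\rangle\right]
\end{aligned} \tag{99}$$

Similarly we rewrite the various conditions on the wave functions, with $\lambda$ explicitly displayed. We obtain for the orthonormality and completeness

$$\langle\psi_{m\alpha}(\lambda)|\psi_{n\beta}(\lambda)\rangle = \delta_{m\alpha,n\beta} \tag{100}$$

$$\sum_{m\alpha} |\psi_{m\alpha}(\lambda)\rangle\langle\psi_{m\alpha}(\lambda)| = 1 \tag{101}$$

Charge conservation is expressed by the two equivalent statements

$$[\varepsilon_m(\lambda) - \varepsilon_n(\lambda)]\, \rho_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) - i\lambda\hbar\, \nabla'\cdot\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) = 0 \tag{102}$$

$$[\varepsilon_m(\lambda) - \varepsilon_n(\lambda)]\langle\psi_{m\alpha}(\lambda)|f_\mu|\psi_{n\beta}(\lambda)\rangle - \lambda\langle\psi_{m\alpha}(\lambda)|[P, f_\mu]|\psi_{n\beta}(\lambda)\rangle = 0 \tag{103}$$

where

$$\begin{aligned}
\rho_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) &= \langle\psi_{m\alpha}(\lambda)|\rho_\mu(\mathbf{r}')|\psi_{n\beta}(\lambda)\rangle \\
\mathbf{j}_{\mu,m\alpha,n\beta}(\mathbf{r}',\lambda) &= \langle\psi_{m\alpha}(\lambda)|\mathbf{j}_\mu(\mathbf{r}')|\psi_{n\beta}(\lambda)\rangle
\end{aligned} \tag{104}$$

We emphasize again that $f_\mu$ is a function of the position $\mathbf{r}_\mu$ only.

Before proceeding further with equations (99) to (104) we introduce another useful operator, namely

$$\beta = \beta_1 \beta_2 \cdots \beta_N \tag{105}$$

Obviously $\beta$ is unitary Hermitian. We shall see shortly that this operator is the many-particle generalization of the one-particle operator $\beta$ for the purpose of the mapping $\psi(\lambda) \to \psi(-\lambda)$. However, the many-particle $\beta$ has no direct simple connection with the rest mass operator $M$; equation (67), or anything like it, does not hold for the many-particle case. In connection with other operators, the following commutation and anticommutation relations are useful:

$$[\beta, \rho_\mu(\mathbf{r}')] = \{\beta, \mathbf{j}_\mu(\mathbf{r}')\} = 0 \tag{106}$$

As in the one-particle case, it is important to have in hand the wave functions and scaled energies for $\lambda = 0$. Clearly, for this limiting case our system becomes Hamiltonian, with the eigenvalue equation

$$M\psi_{m\alpha}(0) = \varepsilon_m(0)\, \psi_{m\alpha}(0) \tag{107}$$

Equation (107) is formally identical with equation (70) for the one-particle case. Actually, equation (107) is separable into $N$ such one-particle problems.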
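The separability of equation (107) and the basic properties of the product operator (105) can be illustrated for $N = 2$ with explicit $16 \times 16$ matrices. This is a sketch only: the masses `M1`, `M2` and the numerical momentum components `P1`, `P2` (one Cartesian component per particle, standing in for the momentum operators) are hypothetical, and the standard representation of the Dirac matrices is assumed.

```python
def kron(X, Y):
    """Direct (Kronecker) product; compound indices in dictionary order."""
    return [[x * y for x in xrow for y in yrow] for xrow in X for yrow in Y]


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(c1, X, c2, Y):
    """Linear combination c1*X + c2*Y of two equally sized matrices."""
    n = len(X)
    return [[c1 * X[i][j] + c2 * Y[i][j] for j in range(n)] for i in range(n)]


def max_abs(X):
    return max(abs(x) for row in X for x in row)


I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ALPHA_X = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

M1, M2 = 1.0, 3.0    # hypothetical masses
P1, P2 = 0.7, 1.1    # hypothetical momentum components

beta_1, beta_2 = kron(BETA, I4), kron(I4, BETA)
alpha_1, alpha_2 = kron(ALPHA_X, I4), kron(I4, ALPHA_X)

M = combine(M1, beta_1, M2, beta_2)     # rest mass operator, lambda = 0
P = combine(P1, alpha_1, P2, alpha_2)   # one-component stand-in for P
beta = matmul(beta_1, beta_2)           # product operator, equation (105)

beta_commutes_with_M = max_abs(
    combine(1, matmul(beta, M), -1, matmul(M, beta))) < 1e-12
beta_anticommutes_with_P = max_abs(
    combine(1, matmul(beta, P), 1, matmul(P, beta))) < 1e-12


def sign(s_mu):
    """+1 for the upper pair of components (0, 1), -1 for the lower pair."""
    return 1 if s_mu < 2 else -1


states = [(s1, s2) for s1 in range(4) for s2 in range(4)]
# M is diagonal; each basis spinor has eigenvalue (+/-)m_1 + (+/-)m_2.
eigenvalues_ok = all(
    M[4 * s1 + s2][4 * s1 + s2] == M1 * sign(s1) + M2 * sign(s2)
    for s1, s2 in states)
# beta is diagonal with eigenvalue equal to the product of the two signs.
parity_ok = all(
    beta[4 * s1 + s2][4 * s1 + s2] == sign(s1) * sign(s2)
    for s1, s2 in states)

spectrum = sorted({M[s][s] for s in range(16)})
print(beta_commutes_with_M, beta_anticommutes_with_P, spectrum)
```

The eigenvalue of the product operator is $+1$ or $-1$ according to whether an even or odd number of one-particle factors has negative rest mass, and it is not proportional to the eigenvalue of $M$, in line with the remark that equation (67) has no many-particle counterpart.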
We take for the wave functions $\Psi_{m\alpha}(0)$ the direct products of the one-particle solutions; the eigenvalues are given by

$$
\varepsilon_m(0) = \sum_\mu \Gamma_{m\mu}\, m_\mu, \qquad \Gamma_{m\mu} = \pm 1 \tag{108}
$$

To obtain all possible eigenvalues, one must in general take all possible combinations of $+$ and $-$ signs for the different values of $\mu$; this corresponds to the individual particle spinors having positive and negative rest masses, respectively. We say that the spinors with $\Gamma_{m\mu} = -1$ constitute holes in the wave function $\Psi_{m\alpha}(0)$. If any of the particles have equal masses, there is additional degeneracy for $\varepsilon_m(0)$, but that need not concern us at this moment. For the direct product wave functions we have

$$
\beta\, \Psi_{m\alpha}(0) = \Gamma_m \Psi_{m\alpha}(0), \qquad \Gamma_m = (-1)^{\frac12 \sum_\mu (1 - \Gamma_{m\mu})} \tag{109}
$$

Note that according to equations (107) and (109) the $\Psi_{m\alpha}(0)$ are simultaneous eigenfunctions of $M$ and $\beta$; this is of course possible because $M$ and $\beta$ commute, see equations (106). Evidently $\Gamma_m = 1$ or $\Gamma_m = -1$ holds for a wave function with an even or odd number of holes, respectively. Accordingly we call $\beta$ the hole parity operator. The wave function $\Psi_{m\alpha}(\lambda)$ does not possess hole parity, but $\Psi_{m\alpha}(0)$ does possess it according to equations (109). If an operator commutes/anticommutes with $\beta$ it will preserve/reverse hole parity.

We now return to equations (99) to (104) for $\lambda \neq 0$. With the help of equations (106) it is not difficult to see that equations (99) to (104) remain valid if we replace each wave function symbol $\Psi$ by $\beta\Psi$, and change the sign of $\lambda$ whenever $\lambda$ occurs as an argument of $\varepsilon$, $\rho$, $\mathbf{j}$ or $\Psi$ (not when $\lambda$ appears algebraically). Suppressing indices for a moment, we can say therefore that the $\varepsilon(-\lambda)$ are the same functions of the $\beta\Psi(-\lambda)$ as the $\varepsilon(\lambda)$ are of the $\Psi(\lambda)$. Clearly the physical solutions for the two cases, which occur when all $\varepsilon(\lambda)$ and $\varepsilon(-\lambda)$ are stationary, must be mappings of each other. More precisely, there is a correspondence $m \leftrightarrow n$ so that $\varepsilon_m(-\lambda) = \varepsilon_n(\lambda)$, and the $\beta\Psi_{m\alpha}(\lambda)$ are linear transformations of the $\Psi_{n\beta}(-\lambda)$.
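The bookkeeping behind equations (108) and (109) can be made concrete with a short enumeration. This is only a sketch: the function name `rest_mass_levels` and the sample masses are illustrative assumptions, and the hole parity is computed as $(-1)$ raised to the number of holes, which is what the text describes in words.

```python
from itertools import product


def rest_mass_levels(masses):
    """List every sign pattern Gamma_mu = +/-1 with its rest mass and hole parity."""
    levels = []
    for signs in product((+1, -1), repeat=len(masses)):
        energy = sum(g * m for g, m in zip(signs, masses))   # sum of Gamma_mu * m_mu
        holes = sum(1 for g in signs if g == -1)             # spinors with Gamma_mu = -1
        parity = (-1) ** holes                               # +1 for an even number of holes
        levels.append((signs, energy, parity))
    return levels


# Two particles with illustrative masses 1.0 and 2.0 (arbitrary units).
for signs, energy, parity in rest_mass_levels([1.0, 2.0]):
    print(signs, energy, parity)
```

For unequal masses every sign pattern gives a distinct rest mass; with equal masses several patterns coincide, which is the additional degeneracy mentioned above.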
Continuity for $\lambda = 0$ together with equations (109) establishes $m = n$, and Appendix A again justifies the canonical form of the transformation induced by $\beta$.

In summary, we have proved that for the many-particle case the dependence of the energies and wave functions on the expansion parameter $\lambda$ has again the special properties expressed by equations (74) and (75). We conclude, then, that the Pauli perturbation expansion, as expressed by equations (76), also holds for the many-particle case, as does equation (77). The latter equation takes on new significance for the many-particle case; it expresses that the expansion functions $\Psi_{m\alpha,p}$ have definite hole parity, alternating for even and odd $p$.

For the charge and current densities the mapping by $\beta$ yields the simple result

$$
\begin{aligned}
\rho_{\mu, m\alpha, n\beta}(\mathbf{r}', \lambda) &= \Gamma_m \Gamma_n\, \rho_{\mu, m\alpha, n\beta}(\mathbf{r}', -\lambda) \\
\mathbf{j}_{\mu, m\alpha, n\beta}(\mathbf{r}', \lambda) &= -\Gamma_m \Gamma_n\, \mathbf{j}_{\mu, m\alpha, n\beta}(\mathbf{r}', -\lambda)
\end{aligned}
\tag{110}
$$

We now pursue in more detail the consequences of the perturbation expansion (76) with respect to orthonormality, completeness, charge and current densities, charge conservation and the energy, equations (99) to (104).
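The mapping $\lambda \rightarrow -\lambda$ induced by $\beta$ can be illustrated on the simplest possible case. The sketch below assumes a single free particle with a fixed numerical momentum, so that the scaled Hamiltonian reduces to the 4x4 matrix $m\beta + \lambda\, \boldsymbol{\alpha} \cdot \mathbf{p}$; the interaction term of equation (99) is deliberately left out, and the function name `scaled_dirac` is illustrative.

```python
import numpy as np

# Dirac matrices in the standard (Dirac) representation.
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in SIGMA]
BETA = np.block([[I2, Z2], [Z2, -I2]])


def scaled_dirac(mass, momentum, lam):
    """Scaled free Dirac matrix M + lam * P for one particle with a numerical momentum."""
    return mass * BETA + lam * sum(pc * a for pc, a in zip(momentum, ALPHA))


mass, momentum, lam = 1.0, (0.3, -0.2, 0.5), 0.05
h_plus = scaled_dirac(mass, momentum, lam)
h_minus = scaled_dirac(mass, momentum, -lam)

# Eigenpairs at +lam; applying beta to the eigenvectors should give eigenvectors at -lam.
energies, states = np.linalg.eigh(h_plus)
mapped = BETA @ states
residual = h_minus @ mapped - mapped * energies   # column j is scaled by energies[j]

print("beta H(lam) beta = H(-lam):", np.allclose(BETA @ h_plus @ BETA, h_minus))
print("beta maps eigenvectors:    ", np.allclose(residual, 0))
```

Because the energies at $+\lambda$ and $-\lambda$ coincide, they can depend on $\lambda$ only through even powers, which is the property the perturbation expansion relies on.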
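For the orthonormality and completeness conditions the results are simple, namely

$$
\sum_{q=0}^{p} \langle \Psi_{m\alpha, p-q} | \Psi_{n\beta, q} \rangle = \delta_{m\alpha, n\beta}\, \delta_{p,0} \tag{111}
$$

$$
\sum_{m\alpha} \sum_{q=0}^{p} | \Psi_{m\alpha, p-q} \rangle \langle \Psi_{m\alpha, q} | = \delta_{p,0} \tag{112}
$$

For the charge and current densities we obtain the expansions

$$
\begin{aligned}
\rho_{\mu, m\alpha, n\beta}(\mathbf{r}', \lambda) &= \sum_{p=0}^{\infty} \rho_{\mu, m\alpha, n\beta, p}(\mathbf{r}')\, \lambda^p \\
\mathbf{j}_{\mu, m\alpha, n\beta}(\mathbf{r}', \lambda) &= \sum_{p=0}^{\infty} \mathbf{j}_{\mu, m\alpha, n\beta, p}(\mathbf{r}')\, \lambda^p
\end{aligned}
\tag{113}
$$

where

$$
\begin{aligned}
\rho_{\mu, m\alpha, n\beta, p}(\mathbf{r}') &= \sum_{q=0}^{p} \langle \Psi_{m\alpha, p-q} | \rho_\mu(\mathbf{r}') | \Psi_{n\beta, q} \rangle \\
\mathbf{j}_{\mu, m\alpha, n\beta, p}(\mathbf{r}') &= \sum_{q=0}^{p} \langle \Psi_{m\alpha, p-q} | \mathbf{j}_\mu(\mathbf{r}') | \Psi_{n\beta, q} \rangle
\end{aligned}
\tag{114}
$$

For the charge conservation condition we get different equations for the even and odd powers of $\lambda$, namely

$$
\begin{aligned}
\sum_{q=0}^{p} (\varepsilon_{m,q} - \varepsilon_{n,q})\, \rho_{\mu, m\alpha, n\beta, 2p-2q}(\mathbf{r}') - i\hbar\, \nabla' \cdot \mathbf{j}_{\mu, m\alpha, n\beta, 2p-1}(\mathbf{r}') &= 0 \\
\sum_{q=0}^{p} (\varepsilon_{m,q} - \varepsilon_{n,q})\, \rho_{\mu, m\alpha, n\beta, 2p-2q+1}(\mathbf{r}') - i\hbar\, \nabla' \cdot \mathbf{j}_{\mu, m\alpha, n\beta, 2p}(\mathbf{r}') &= 0
\end{aligned}
\tag{115}
$$

or the equivalent form

$$
\begin{aligned}
\sum_{q=0}^{p} (\varepsilon_{m,q} - \varepsilon_{n,q}) \sum_{r=0}^{2p-2q} \langle \Psi_{m\alpha, 2p-2q-r} | f_\mu | \Psi_{n\beta, r} \rangle - \sum_{r=0}^{2p-1} \langle \Psi_{m\alpha, 2p-r-1} | [P, f_\mu] | \Psi_{n\beta, r} \rangle &= 0 \\
\sum_{q=0}^{p} (\varepsilon_{m,q} - \varepsilon_{n,q}) \sum_{r=0}^{2p-2q+1} \langle \Psi_{m\alpha, 2p-2q-r+1} | f_\mu | \Psi_{n\beta, r} \rangle - \sum_{r=0}^{2p} \langle \Psi_{m\alpha, 2p-r} | [P, f_\mu] | \Psi_{n\beta, r} \rangle &= 0
\end{aligned}
\tag{116}
$$

In the first equations (115) and (116) the current term is meaningless for $p = 0$; the correct interpretation is to omit the offending term.

In order to apply the perturbation expansion to the energy formula (99) it is convenient to introduce the functions

$$
F_{nm\alpha\beta}(\mathbf{r}', \mathbf{r}'', \lambda) = \tfrac12 \sum_\mu \sum_{\nu \neq \mu} \sum_\gamma \big[ \rho_{\mu, m\alpha, n\gamma}(\mathbf{r}', \lambda)\, \rho_{\nu, n\gamma, m\beta}(\mathbf{r}'', \lambda) - \mathbf{j}_{\mu, m\alpha, n\gamma}(\mathbf{r}', \lambda) \cdot \mathbf{j}_{\nu, n\gamma, m\beta}(\mathbf{r}'', \lambda) \big]
\tag{117}
$$

Note that, because of equations (110),

$$
F_{nm\alpha\beta}(\mathbf{r}', \mathbf{r}'', \lambda) = F_{nm\alpha\beta}(\mathbf{r}', \mathbf{r}'', -\lambda) \tag{118}
$$

Using equation (117) we can rewrite the energy formula (99) in the simpler form

$$
\varepsilon_m(\lambda)\, \delta_{\alpha\beta} = \langle \Psi_{m\alpha}(\lambda) | M + \lambda P | \Psi_{m\beta}(\lambda) \rangle + \lambda^2 \sum_n \int d\mathbf{r}' \int d\mathbf{r}''\, R^{-1} \cos\{[\varepsilon_m(\lambda) - \varepsilon_n(\lambda)] R / \hbar\lambda\}\, F_{nm\alpha\beta}(\mathbf{r}', \mathbf{r}'', \lambda)
\tag{119}
$$

We now apply the perturbation expansion (76) to equation (119). The Dirac term can be handled like the orthonormality condition, and poses no new problem. The interaction term is also straightforward with respect to the function $F_{nm\alpha\beta}(\mathbf{r}', \mathbf{r}'', \lambda)$, for which we have the expansion

$$
F_{nm\alpha\beta}(\mathbf{r}', \mathbf{r}'', \lambda) = \sum_{p=0}^{\infty} F_{nm\alpha\beta, p}(\mathbf{r}', \mathbf{r}'')\, \lambda^{2p} \tag{120}
$$

where

$$
F_{nm\alpha\beta, p}(\mathbf{r}', \mathbf{r}'') = \tfrac12 \sum_\mu \sum_{\nu \neq \mu} \sum_\gamma \sum_{q=0}^{2p} \big[ \rho_{\mu, m\alpha, n\gamma, 2p-q}(\mathbf{r}')\, \rho_{\nu, n\gamma, m\beta, q}(\mathbf{r}'') - \mathbf{j}_{\mu, m\alpha, n\gamma, 2p-q}(\mathbf{r}') \cdot \mathbf{j}_{\nu, n\gamma, m\beta, q}(\mathbf{r}'') \big]
\tag{121}
$$

The cosine factor in equation (119) poses a new problem, because of the manner in which $\lambda$ appears in the argument. We find for this argument the expansion

$$
[\varepsilon_m(\lambda) - \varepsilon_n(\lambda)]\, R / \hbar\lambda = R \hbar^{-1} \sum_{p=0}^{\infty} (\varepsilon_{m,p} - \varepsilon_{n,p})\, \lambda^{2p-1} \tag{122}
$$

The first term in this expansion is of order $\lambda$ if $\varepsilon_{m,0} = \varepsilon_{n,0}$, and of order $\lambda^{-1}$ if $\varepsilon_{m,0} \neq \varepsilon_{n,0}$. The cosine will have to be treated quite differently for these two cases, and the summation over $n$ in equation (119) has to be split up accordingly. For this purpose it is useful to define the sets $\mathcal{F}(m)$ and $\mathcal{G}(m)$ by means of

$$
\begin{aligned}
n \in \mathcal{F}(m): &\quad \varepsilon_{n,0} = \varepsilon_{m,0} \\
n \in \mathcal{G}(m): &\quad \varepsilon_{n,0} \neq \varepsilon_{m,0}
\end{aligned}
\tag{123}
$$

so that

$$
\sum_n = \sum_{n \in \mathcal{F}(m)} + \sum_{n \in \mathcal{G}(m)} \tag{124}
$$

If the argument of the cosine is of order $\lambda$, we can use the power series expansion of the cosine. Hence we have up to order $\lambda^4$

$$
\cos\{[\varepsilon_m(\lambda) - \varepsilon_n(\lambda)] R / \hbar\lambda\} = 1 - \tfrac12 \hbar^{-2} (\varepsilon_{m,1} - \varepsilon_{n,1})^2 R^2 \lambda^2 + \mathcal{O}(\lambda^4), \qquad n \in \mathcal{F}(m)
\tag{125}
$$

On the other hand, if the argument of the cosine is of order $\lambda^{-1}$, the power series expansion of the cosine is useless. We can however develop the integral in a power series of $\lambda$. We start from an asymptotic expansion which is proved in Appendix B, namely the expansion of the operator $R^{-1} \cos(kR)$ for $k \rightarrow \infty$:

$$
R^{-1} \cos(kR) = -4\pi\, \delta(\mathbf{r}' - \mathbf{r}'') \sum_{p=0}^{\infty} k^{-2p-2}\, (\nabla' \cdot \nabla'')^p \tag{126}
$$

We have to apply this formula, where $k$ is itself an odd power series in $\lambda$ starting with $\lambda^{-1}$. The result is, to order $\lambda^4$

$$
R^{-1} \cos\{[\varepsilon_m(\lambda) - \varepsilon_n(\lambda)] R / \hbar\lambda\} = -4\pi\hbar^2 (\varepsilon_{m,0} - \varepsilon_{n,0})^{-2}\, \delta(\mathbf{r}' - \mathbf{r}'')\, \lambda^2 + \mathcal{O}(\lambda^4), \qquad n \in \mathcal{G}(m)
\tag{127}
$$

We can now put together the energy expansion up to order $\lambda^4$. Because of hole parity, we need to consider only even powers; we obtain

$$
\begin{aligned}
\varepsilon_{m,0}\, \delta_{\alpha\beta} ={}& \langle \Psi_{m\alpha,0} | M | \Psi_{m\beta,0} \rangle \\
\varepsilon_{m,1}\, \delta_{\alpha\beta} ={}& \sum_{q=0}^{2} \langle \Psi_{m\alpha,2-q} | M | \Psi_{m\beta,q} \rangle + \sum_{q=0}^{1} \langle \Psi_{m\alpha,1-q} | P | \Psi_{m\beta,q} \rangle \\
&+ \sum_{n \in \mathcal{F}(m)} \int d\mathbf{r}' \int d\mathbf{r}''\, R^{-1} F_{nm\alpha\beta,0}(\mathbf{r}', \mathbf{r}'') \\
\varepsilon_{m,2}\, \delta_{\alpha\beta} ={}& \sum_{q=0}^{4} \langle \Psi_{m\alpha,4-q} | M | \Psi_{m\beta,q} \rangle + \sum_{q=0}^{3} \langle \Psi_{m\alpha,3-q} | P | \Psi_{m\beta,q} \rangle \\
&+ \sum_{n \in \mathcal{F}(m)} \int d\mathbf{r}' \int d\mathbf{r}''\, R^{-1} F_{nm\alpha\beta,1}(\mathbf{r}', \mathbf{r}'') \\
&- \tfrac12 \hbar^{-2} \sum_{n \in \mathcal{F}(m)} (\varepsilon_{m,1} - \varepsilon_{n,1})^2 \int d\mathbf{r}' \int d\mathbf{r}''\, R\, F_{nm\alpha\beta,0}(\mathbf{r}', \mathbf{r}'') \\
&- 4\pi\hbar^2 \sum_{n \in \mathcal{G}(m)} (\varepsilon_{m,0} - \varepsilon_{n,0})^{-2} \int d\mathbf{r}' \int d\mathbf{r}''\, \delta(\mathbf{r}' - \mathbf{r}'')\, F_{nm\alpha\beta,0}(\mathbf{r}', \mathbf{r}'')
\end{aligned}
\tag{128}
$$

In the last term of $\varepsilon_{m,2}$ we could of course have carried out one integration; however the expression as given will turn out to be more convenient in the development which follows.

We are now ready to apply the variation principle. In the case of a perturbation expansion, the proper procedure is to apply the variation process successively for each power of $\lambda$. We demand of course that the wave functions are constrained by orthonormality; we expect that our variational solutions can be chosen so that they are eigenfunctions of $\beta$, and that they satisfy the completeness and the charge conservation conditions.

We apply the variation principle to the first equation (128), which arises from $\lambda^0$; we also must honour the corresponding orthonormality constraint, equation (111) for $p = 0$. This is the usual Hamiltonian variational problem, and we obtain of course again equation (107). The solutions $\Psi_{m\alpha,0}$ are taken as direct products of one-electron spinors; $\varepsilon_{m,0}$ is the rest mass of $\Psi_{m\alpha,0}$. If there are identical particles, proper linear combinations of direct products can always be taken so that the $\Psi_{m\alpha,0}$ have appropriate symmetry properties with respect to permutations of particles. We feel confident that the eigenfunctions of $M$ span the entire Hilbert space, so that the completeness condition (112) is satisfied for $p = 0$. In Appendix C it is shown that the charge conservation conditions which depend on $\Psi_{m\alpha,0}$ are also satisfied.

We now turn to the second equation (128). First we evaluate the interaction term.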
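The asymptotic expansion (126) can be probed numerically before it is used. The sketch below is not the proof of Appendix B; it assumes a Gaussian test function $f(\mathbf{r}) = \exp(-r^2/2)$, evaluates $\int d\mathbf{r}''\, R^{-1} \cos(kR)\, f(\mathbf{r}'')$ at $\mathbf{r}' = 0$ by a plain trapezoidal rule, and compares it with the first two terms of the series, for which $f(0) = 1$ and $-\nabla^2 f(0) = 3$. The function names and grid parameters are illustrative.

```python
import numpy as np


def radial_integral(k, r_max=12.0, n=400_001):
    """4*pi * integral_0^inf r cos(k r) exp(-r^2/2) dr, by the trapezoidal rule."""
    r = np.linspace(0.0, r_max, n)
    y = r * np.cos(k * r) * np.exp(-0.5 * r**2)
    dr = r[1] - r[0]
    return 4.0 * np.pi * dr * (np.sum(y) - 0.5 * (y[0] + y[-1]))


def asymptotic_series(k, terms):
    """First `terms` terms of -4*pi * sum_p k^(-2p-2) c_p for the Gaussian test function."""
    coefficients = [1.0, 3.0]   # f(0) and -laplacian f(0) for f = exp(-r^2/2)
    return -4.0 * np.pi * sum(c * k ** (-2 * p - 2) for p, c in enumerate(coefficients[:terms]))


for k in (5.0, 10.0, 20.0):
    exact = radial_integral(k)
    print(k, exact, asymptotic_series(k, 1), asymptotic_series(k, 2))
```

For the values of $k$ sampled here, the two-term series should track the integral more closely than the leading $-4\pi k^{-2}$ term alone, and the agreement should improve as $k$ grows, as expected of an asymptotic expansion.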
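For the charge and current densities we have the obvious identities

$$
[M, \rho_\mu(\mathbf{r}')] = 0, \qquad \mathbf{j}_\mu(\mathbf{r}') = (2m_\mu)^{-1} [M, \beta_\mu \mathbf{j}_\mu(\mathbf{r}')] \tag{129}
$$

Taking matrix elements with the zero-order wave functions we obtain

$$
\begin{aligned}
(\varepsilon_{m,0} - \varepsilon_{n,0})\, \rho_{\mu, m\alpha, n\beta, 0}(\mathbf{r}') &= 0 \\
\mathbf{j}_{\mu, m\alpha, n\beta, 0}(\mathbf{r}') &= (2m_\mu)^{-1} (\varepsilon_{m,0} - \varepsilon_{n,0})\, \langle \Psi_{m\alpha,0} | \beta_\mu \mathbf{j}_\mu(\mathbf{r}') | \Psi_{n\beta,0} \rangle
\end{aligned}
\tag{130}
$$

We conclude from equations (130) that

$$
\begin{aligned}
\rho_{\mu, m\alpha, n\beta, 0}(\mathbf{r}') &= 0, \qquad n \in \mathcal{G}(m) \\
\mathbf{j}_{\mu, m\alpha, n\beta, 0}(\mathbf{r}') &= 0, \qquad n \in \mathcal{F}(m)
\end{aligned}
\tag{131}
$$

Using these relations we may write

$$
\begin{aligned}
\sum_{n \in \mathcal{F}(m)} F_{nm\alpha\beta,0}(\mathbf{r}', \mathbf{r}'')
&= \tfrac12 \sum_{n \in \mathcal{F}(m)} \sum_\mu \sum_{\nu \neq \mu} \sum_\gamma \rho_{\mu, m\alpha, n\gamma, 0}(\mathbf{r}')\, \rho_{\nu, n\gamma, m\beta, 0}(\mathbf{r}'') \\
&= \tfrac12 \sum_\mu \sum_{\nu \neq \mu} \sum_{n\gamma} \langle \Psi_{m\alpha,0} | \rho_\mu(\mathbf{r}') | \Psi_{n\gamma,0} \rangle \langle \Psi_{n\gamma,0} | \rho_\nu(\mathbf{r}'') | \Psi_{m\beta,0} \rangle \\
&= \tfrac12 \sum_\mu \sum_{\nu \neq \mu} \langle \Psi_{m\alpha,0} | \rho_\mu(\mathbf{r}')\, \rho_\nu(\mathbf{r}'') | \Psi_{m\beta,0} \rangle \\
&= \tfrac12 \sum_\mu \sum_{\nu \neq \mu} e_\mu e_\nu\, \langle \Psi_{m\alpha,0} | \delta(\mathbf{r}' - \mathbf{r}_\mu)\, \delta(\mathbf{r}'' - \mathbf{r}_\nu) | \Psi_{m\beta,0} \rangle
\end{aligned}
\tag{132}
$$

In obtaining the final result (132), we first dropped the current density term on account of the second equation (131); next we extended the summation over states to all states, on account of the first equation (131); next we used the completeness relation, equation (112) for $p = 0$, to carry out the sum over states in closed form; and last, we used the definition of the charge density, equation (19). Using this result in the interaction term in the second equation (128), we can carry out the integrations, and obtain

$$
\sum_{n \in \mathcal{F}(m)} \int d\mathbf{r}' \int d\mathbf{r}''\, R^{-1} F_{nm\alpha\beta,0}(\mathbf{r}', \mathbf{r}'') = \langle \Psi_{m\alpha,0} | V | \Psi_{m\beta,0} \rangle \tag{133}
$$

where $V$ is the usual Coulomb interaction operator, namely

$$
V = \tfrac12 \sum_\mu \sum_{\nu \neq \mu} e_\mu e_\nu\, r_{\mu\nu}^{-1} \tag{134}
$$

with the common abbreviation

$$
r_{\mu\nu} = |\mathbf{r}_{\mu\nu}| = |\mathbf{r}_\mu - \mathbf{r}_\nu| \tag{135}
$$

If we substitute the Coulomb energy expression (133) into the second equation (128), we note that the same result would have been obtained if the scaled Dirac Hamiltonian had had an additional term $\lambda^2 V$. Hence up to order $\lambda^2$ and the energy $\varepsilon_{m,1}$ we do have a Hamiltonian formulation as a valid description of the electromagnetic interaction.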
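The spinor algebra behind the identities (129) and the selection rules (131) can be verified with matrices. In this sketch, which is an illustration rather than part of the derivation, each component of $\boldsymbol{\alpha}_\mu$ stands in for $\mathbf{j}_\mu(\mathbf{r}')$ (the factor $e_\mu \delta(\mathbf{r}' - \mathbf{r}_\mu)$ commutes with all the matrices and is dropped), the standard Dirac representation is assumed so that $M$ is diagonal, and the helper names are hypothetical.

```python
import numpy as np

# Dirac matrices in the standard (Dirac) representation.
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in SIGMA]
BETA = np.block([[I2, Z2], [Z2, -I2]])
I4 = np.eye(4, dtype=complex)


def embed(op, mu, n):
    """Act with the 4x4 matrix `op` on particle `mu` of an n-particle product space."""
    out = np.eye(1, dtype=complex)
    for k in range(n):
        out = np.kron(out, op if k == mu else I4)
    return out


def check_current_identities(masses):
    """Check j = [M, beta_mu j]/(2 m_mu), beta_mu j = [M, j]/(2 m_mu), and rule (131)."""
    n = len(masses)
    M = sum(m * embed(BETA, mu, n) for mu, m in enumerate(masses))
    rest_mass = np.real(np.diag(M))                  # M is diagonal in this representation
    same_mass = np.isclose(rest_mass[:, None], rest_mass[None, :])
    ok = True
    for mu, m in enumerate(masses):
        beta_mu = embed(BETA, mu, n)
        for a in ALPHA:
            j = embed(a, mu, n)                      # spinor part of the current operator
            ok &= np.allclose(j, (M @ beta_mu @ j - beta_mu @ j @ M) / (2 * m))
            ok &= np.allclose(beta_mu @ j, (M @ j - j @ M) / (2 * m))
            ok &= np.allclose(j[same_mass], 0.0)     # no current between equal rest masses
    return bool(ok)


print(check_current_identities([1.0, 2.5]))
print(check_current_identities([1.0, 1.0]))
```

The second call uses equal masses, where several product states share the same rest mass; the current matrix elements within such a degenerate set should still vanish, because a single $\boldsymbol{\alpha}_\mu$ always flips exactly one sign $\Gamma_{m\mu}$.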
The second equation (128) now becomes explicitly

$$
\begin{aligned}
\varepsilon_{m,1}\, \delta_{\alpha\beta} ={}& \langle \Psi_{m\alpha,2} | M | \Psi_{m\beta,0} \rangle + \langle \Psi_{m\alpha,1} | M | \Psi_{m\beta,1} \rangle + \langle \Psi_{m\alpha,0} | M | \Psi_{m\beta,2} \rangle \\
&+ \langle \Psi_{m\alpha,1} | P | \Psi_{m\beta,0} \rangle + \langle \Psi_{m\alpha,0} | P | \Psi_{m\beta,1} \rangle + \langle \Psi_{m\alpha,0} | V | \Psi_{m\beta,0} \rangle
\end{aligned}
\tag{136}
$$

We eliminate $\Psi_{m\alpha,2}$ and $\Psi_{m\beta,2}$ by taking the orthonormality condition, equation (111), for $p = 2$, multiplying by $\varepsilon_{m,0}$, and subtracting the result from equation (136). Making use also of the commutator expressions for $P$ and $T$, equations (97) and (98), and of equation (107), we obtain after some manipulation

$$
\varepsilon_{m,1}\, \delta_{\alpha\beta} = \langle \Psi_{m\alpha,1} - K\Psi_{m\alpha,0} | M - \varepsilon_{m,0} | \Psi_{m\beta,1} - K\Psi_{m\beta,0} \rangle + \langle \Psi_{m\alpha,0} | T + V | \Psi_{m\beta,0} \rangle
\tag{137}
$$

We now apply the variational procedure to equation (137). We demand that $\varepsilon_{m,1}$ be stationary for variations in $\Psi_{m\alpha,0}$ and $\Psi_{m\alpha,1}$, maintaining the relevant orthonormality constraints, equation (111) for $p = 0$ and $p = 1$. As far as the variation is concerned, we specifically do not impose the other known conditions which the wave functions have to satisfy: completeness, charge conservation, $\Psi_{m\alpha,0}$ eigenfunctions of $M$ and $\beta$, $\Psi_{m\alpha,1}$ eigenfunction of $\beta$. It will be seen, however, that the variational solutions obtained permit a choice so that all these other conditions are indeed satisfied. The variations with respect to $\Psi_{m\alpha,1}$ and $\Psi_{m\alpha,0}$ yield, after the usual determination of the Lagrange multipliers,

$$
(M - \varepsilon_{m,0})(\Psi_{m\alpha,1} - K\Psi_{m\alpha,0}) = 0 \tag{138}
$$

$$
(T + V)\, \Psi_{m\alpha,0} = \varepsilon_{m,1}\, \Psi_{m\alpha,0} \tag{139}
$$

Since $T$ and $V$ commute with both $M$ and $\beta$, equation (139) permits a solution so that $\Psi_{m\alpha,0}$ is a simultaneous eigenfunction of $M$, $\beta$, and $T + V$. We can also say that equation (139) removes a substantial part of the degeneracy inherent in the $\Psi_{m\alpha,0}$ up to this point. For the no-hole solutions, equation (139) is equivalent to the non-relativistic Schrödinger equation for $N$ particles with spin one half.

The general solution of equation (138) is

$$
\Psi_{m\alpha,1} = K\Psi_{m\alpha,0} + \Psi'_{m\alpha,0} \tag{140}
$$

where $\Psi'_{m\alpha,0}$ is any function of rest mass $\varepsilon_{m,0}$.
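For a single free particle the first-order results can be compared with the exact solution. The sketch below assumes $V = 0$, a fixed numerical momentum, a no-hole spin-up zero-order spinor, the particular solution $\Psi_1 = K\Psi_0$ of equation (138), and $K = (2m)^{-1} \boldsymbol{\alpha} \cdot \mathbf{p}\, \beta$; all names and numerical values are illustrative. The exact scaled energy is $\sqrt{m^2 + \lambda^2 p^2}$, so the residual left after subtracting $m + \lambda^2 p^2 / 2m$ should shrink like $\lambda^4$, and the wave function residual like $\lambda^2$.

```python
import numpy as np

# Dirac matrices in the standard (Dirac) representation.
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in SIGMA]
BETA = np.block([[I2, Z2], [Z2, -I2]])
I4 = np.eye(4, dtype=complex)

MASS = 1.0
MOMENTUM = (0.3, -0.2, 0.5)                      # illustrative numerical momentum
ALPHA_DOT_P = sum(pc * a for pc, a in zip(MOMENTUM, ALPHA))
P_SQUARED = float(np.dot(MOMENTUM, MOMENTUM))
K = ALPHA_DOT_P @ BETA / (2 * MASS)              # assumed operator ordering
PSI0 = np.array([1, 0, 0, 0], dtype=complex)     # no-hole, spin-up zero-order spinor


def first_order_errors(lam):
    """Residuals of the first-order energy and wave function against the exact solution."""
    h = MASS * BETA + lam * ALPHA_DOT_P
    eps = np.sqrt(MASS**2 + lam**2 * P_SQUARED)  # exact positive scaled energy
    psi = 0.5 * (I4 + h / eps) @ PSI0            # project onto the positive-energy states
    psi = psi / np.linalg.norm(psi)
    energy_error = eps - (MASS + lam**2 * P_SQUARED / (2 * MASS))
    state_error = np.linalg.norm(psi - (PSI0 + lam * (K @ PSI0)))
    return energy_error, state_error


for lam in (0.1, 0.05, 0.025):
    print(lam, *first_order_errors(lam))
```

Halving $\lambda$ should reduce the energy residual by roughly a factor of 16 and the wave function residual by roughly a factor of 4, consistent with the next terms in the expansion being of order $\lambda^4$ and $\lambda^2$ respectively.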
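The requirement of hole parity however demands that we restrict the solutions (140) to those where $\Psi_{m\alpha,0}$ and $\Psi'_{m\alpha,0}$ have opposite hole parity. Hence we must have

$$
\Psi'_{m\alpha,0} = 0 \tag{141}
$$

unless there exists an even subset of the $N$ particles with the same mass as some odd subset. Obviously this is a rather special case of accidental degeneracy. It cannot occur for a system of identical particles; since this is our primary concern, we assume from now on that equation (141) holds, so that

$$
\Psi_{m\alpha,1} = K\Psi_{m\alpha,0} \tag{142}
$$

Using equation (142), we easily verify the relevant orthonormality and completeness conditions, equations (111) and (112) for $p = 1$. For charge conservation we again refer to Appendix C.

We now turn to the third equation (128), which arises from the terms of order $\lambda^4$. To evaluate the contribution due to $F_{nm\alpha\beta,1}$ we start from the identities

$$
\begin{aligned}
{}[K, \rho_\mu(\mathbf{r}')] &= (2m_\mu)^{-1} [M, \beta_\mu [K, \rho_\mu(\mathbf{r}')]] \\
[M, [K, \mathbf{j}_\mu(\mathbf{r}')]] &= 0
\end{aligned}
\tag{143}
$$

The proofs are elementary. We note further that from equations (114) and (142) follows

$$
\begin{aligned}
\rho_{\mu, m\alpha, n\beta, 1}(\mathbf{r}') &= -\langle \Psi_{m\alpha,0} | [K, \rho_\mu(\mathbf{r}')] | \Psi_{n\beta,0} \rangle \\
\mathbf{j}_{\mu, m\alpha, n\beta, 1}(\mathbf{r}') &= -\langle \Psi_{m\alpha,0} | [K, \mathbf{j}_\mu(\mathbf{r}')] | \Psi_{n\beta,0} \rangle
\end{aligned}
\tag{144}
$$

Hence taking matrix elements of equations (143) with the zero-order wave functions yields

$$
\begin{aligned}
\rho_{\mu, m\alpha, n\beta, 1}(\mathbf{r}') &= -(2m_\mu)^{-1} (\varepsilon_{m,0} - \varepsilon_{n,0})\, \langle \Psi_{m\alpha,0} | \beta_\mu [K, \rho_\mu(\mathbf{r}')] | \Psi_{n\beta,0} \rangle \\
(\varepsilon_{m,0} - \varepsilon_{n,0})\, \mathbf{j}_{\mu, m\alpha, n\beta, 1}(\mathbf{r}') &= 0
\end{aligned}
\tag{145}
$$

from which we conclude

$$
\begin{aligned}
\rho_{\mu, m\alpha, n\beta, 1}(\mathbf{r}') &= 0, \qquad n \in \mathcal{F}(m) \\
\mathbf{j}_{\mu, m\alpha, n\beta, 1}(\mathbf{r}') &= 0, \qquad n \in \mathcal{G}(m)
\end{aligned}
\tag{146}
$$

Figure 1. Transformation of triple sum

With the help of equations (112), (114), (121), (131), (142) and (146) we proceed to simplify $F_{nm\alpha\beta,1}$, using techniques similar to those used to derive equation (132). We obtain

$$
\begin{aligned}
\sum_{n \in \mathcal{F}(m)} F_{nm\alpha\beta,1}(\mathbf{r}', \mathbf{r}'')
={}& \tfrac12 \sum_{n \in \mathcal{F}(m)} \sum_\mu \sum_{\nu \neq \mu} \sum_\gamma \big[ \rho_{\mu, m\alpha, n\gamma, 2}(\mathbf{r}')\, \rho_{\nu, n\gamma, m\beta, 0}(\mathbf{r}'') + \rho_{\mu, m\alpha, n\gamma, 0}(\mathbf{r}')\, \rho_{\nu, n\gamma, m\beta, 2}(\mathbf{r}'') \\
&\qquad - \mathbf{j}_{\mu, m\alpha, n\gamma, 1}(\mathbf{r}') \cdot \mathbf{j}_{\nu, n\gamma, m\beta, 1}(\mathbf{r}'') \big] \\
={}& \tfrac12 \sum_\mu \sum_{\nu \neq \mu} \sum_{n\gamma} \Big[ \sum_{p=0}^{2} \rho_{\mu, m\alpha, n\gamma, 2-p}(\mathbf{r}')\, \rho_{\nu, n\gamma, m\beta, p}(\mathbf{r}'') \\
&\qquad - \rho_{\mu, m\alpha, n\gamma, 1}(\mathbf{r}')\, \rho_{\nu, n\gamma, m\beta, 1}(\mathbf{r}'') - \mathbf{j}_{\mu, m\alpha, n\gamma, 1}(\mathbf{r}') \cdot \mathbf{j}_{\nu, n\gamma, m\beta, 1}(\mathbf{r}'') \Big] \\
={}& \tfrac12 \sum_\mu \sum_{\nu \neq \mu} \sum_{n\gamma} \Big[ \sum_{p=0}^{2} \sum_{q=0}^{2-p} \sum_{r=0}^{p} \langle \Psi_{m\alpha, 2-p-q} | \rho_\mu(\mathbf{r}') | \Psi_{n\gamma, q} \rangle \langle \Psi_{n\gamma, r} | \rho_\nu(\mathbf{r}'') | \Psi_{m\beta, p-r} \rangle \\
&\qquad - \langle \Psi_{m\alpha,0} | [K, \rho_\mu(\mathbf{r}')] | \Psi_{n\gamma,0} \rangle \langle \Psi_{n\gamma,0} | [K, \rho_\nu(\mathbf{r}'')] | \Psi_{m\beta,0} \rangle \\
&\qquad - \langle \Psi_{m\alpha,0} | [K, \mathbf{j}_\mu(\mathbf{r}')] | \Psi_{n\gamma,0} \rangle \cdot \langle \Psi_{n\gamma,0} | [K, \mathbf{j}_\nu(\mathbf{r}'')] | \Psi_{m\beta,0} \rangle \Big] \\
={}& \tfrac12 \sum_\mu \sum_{\nu \neq \mu} \sum_{n\gamma} \Big[ \sum_{t=0}^{2} \sum_{s=0}^{t} \sum_{r=0}^{s} \langle \Psi_{m\alpha, t-s} | \rho_\mu(\mathbf{r}') | \Psi_{n\gamma, s-r} \rangle \langle \Psi_{n\gamma, r} | \rho_\nu(\mathbf{r}'') | \Psi_{m\beta, 2-t} \rangle \\
&\qquad - \langle \Psi_{m\alpha,0} | [K, \rho_\mu(\mathbf{r}')] | \Psi_{n\gamma,0} \rangle \langle \Psi_{n\gamma,0} | [K, \rho_\nu(\mathbf{r}'')] | \Psi_{m\beta,0} \rangle \\
&\qquad - \langle \Psi_{m\alpha,0} | [K, \mathbf{j}_\mu(\mathbf{r}')] | \Psi_{n\gamma,0} \rangle \cdot \langle \Psi_{n\gamma,0} | [K, \mathbf{j}_\nu(\mathbf{r}'')] | \Psi_{m\beta,0} \rangle \Big] \\
={}& \tfrac12 \sum_\mu \sum_{\nu \neq \mu} \Big[ \sum_{t=0}^{2} \langle \Psi_{m\alpha,t} | \rho_\mu(\mathbf{r}')\, \rho_\nu(\mathbf{r}'') | \Psi_{m\beta,2-t} \rangle \\
&\qquad - \langle \Psi_{m\alpha,0} | [K_\mu, [K_\nu, \rho_\mu(\mathbf{r}')\, \rho_\nu(\mathbf{r}'') + \mathbf{j}_\mu(\mathbf{r}') \cdot \mathbf{j}_\nu(\mathbf{r}'')]] | \Psi_{m\beta,0} \rangle \Big]
\end{aligned}
\tag{147}
$$

In deriving this result we converted the triple sum over $p$, $q$, $r$ by the substitution

$$
p = 2 + r - t, \qquad q = s - r \tag{148}
$$

the corresponding transformation of the summation limits is

$$
\sum_{p=0}^{2} \sum_{q=0}^{2-p} \sum_{r=0}^{p} = \sum_{t=0}^{2} \sum_{s=0}^{t} \sum_{r=0}^{s} \tag{149}
$$

as illustrated in Figure 1.

Integration of (147) over $\mathbf{r}'$ and $\mathbf{r}''$ with the factor $R^{-1}$ readily yields for the first interaction term of $\varepsilon_{m,2}$

$$
\begin{aligned}
\sum_{n \in \mathcal{F}(m)} \int d\mathbf{r}' \int d\mathbf{r}''\, R^{-1} F_{nm\alpha\beta,1}(\mathbf{r}', \mathbf{r}'')
={}& \sum_{t=0}^{2} \langle \Psi_{m\alpha,t} | V | \Psi_{m\beta,2-t} \rangle \\
&- \tfrac12 \sum_\mu \sum_{\nu \neq \mu} e_\mu e_\nu\, \langle \Psi_{m\alpha,0} | [K_\mu, [K_\nu, r_{\mu\nu}^{-1} (1 + \boldsymbol{\alpha}_\mu \cdot \boldsymbol{\alpha}_\nu)]] | \Psi_{m\beta,0} \rangle
\end{aligned}
\tag{150}
$$

In order to evaluate the second interaction term of $\varepsilon_{m,2}$ we need to absorb the factor $(\varepsilon_{m,1} - \varepsilon_{n,1})^2$ into the matrix elements. Such a relation is conveniently provided by the equation of continuity, namely the first equation (115) for $p = 1$, $n \in \mathcal{F}(m)$, yielding

$$
(\varepsilon_{m,1} - \varepsilon_{n,1})\, \rho_{\mu, m\alpha, n\beta, 0}(\mathbf{r}') = i\hbar\, \nabla' \cdot \mathbf{j}_{\mu, m\alpha, n\beta, 1}(\mathbf{r}'), \qquad n \in \mathcal{F}(m) \tag{151}
$$

With the help of this we obtain

$$
\begin{aligned}
-\tfrac12 \hbar^{-2} \sum_{n \in \mathcal{F}(m)} (\varepsilon_{m,1} - \varepsilon_{n,1})^2 F_{nm\alpha\beta,0}(\mathbf{r}', \mathbf{r}'')
&= -\tfrac14 \hbar^{-2} \sum_{n \in \mathcal{F}(m)} \sum_\mu \sum_{\nu \neq \mu} \sum_\gamma (\varepsilon_{m,1} - \varepsilon_{n,1})^2\, \rho_{\mu, m\alpha, n\gamma, 0}(\mathbf{r}')\, \rho_{\nu, n\gamma, m\beta, 0}(\mathbf{r}'') \\
&= -\tfrac14 \sum_{n \in \mathcal{F}(m)} \sum_\mu \sum_{\nu \neq \mu} \sum_\gamma [\nabla' \cdot \mathbf{j}_{\mu, m\alpha, n\gamma, 1}(\mathbf{r}')] [\nabla'' \cdot \mathbf{j}_{\nu, n\gamma, m\beta, 1}(\mathbf{r}'')] \\
&= -\tfrac14 \sum_\mu \sum_{\nu \neq \mu} \sum_{n\gamma} \langle \Psi_{m\alpha,0} | [K, \nabla' \cdot \mathbf{j}_\mu(\mathbf{r}')] | \Psi_{n\gamma,0} \rangle \langle \Psi_{n\gamma,0} | [K, \nabla'' \cdot \mathbf{j}_\nu(\mathbf{r}'')] | \Psi_{m\beta,0} \rangle \\
&= -\tfrac14 \sum_\mu \sum_{\nu \neq \mu} \langle \Psi_{m\alpha,0} | [K_\mu, [K_\nu, [\nabla' \cdot \mathbf{j}_\mu(\mathbf{r}')] [\nabla'' \cdot \mathbf{j}_\nu(\mathbf{r}'')]]] | \Psi_{m\beta,0} \rangle
\end{aligned}
\tag{152}
$$

Integrating (152) over $\mathbf{r}'$ and $\mathbf{r}''$ with the factor $R$, we use Gauss's theorem twice, for $\nabla'$ and $\nabla''$. We get then for the second interaction term of $\varepsilon_{m,2}$

$$
\begin{aligned}
-\tfrac12 \hbar^{-2} \sum_{n \in \mathcal{F}(m)} (\varepsilon_{m,1} - \varepsilon_{n,1})^2 \int d\mathbf{r}' \int d\mathbf{r}''\, R\, F_{nm\alpha\beta,0}(\mathbf{r}', \mathbf{r}'')
&= -\tfrac14 \sum_\mu \sum_{\nu \neq \mu} \int d\mathbf{r}' \int d\mathbf{r}''\, \langle \Psi_{m\alpha,0} | [K_\mu, [K_\nu, [\mathbf{j}_\mu(\mathbf{r}') \cdot \nabla'] [\mathbf{j}_\nu(\mathbf{r}'') \cdot \nabla''] R]] | \Psi_{m\beta,0} \rangle \\
&= -\tfrac14 \sum_\mu \sum_{\nu \neq \mu} e_\mu e_\nu\, \langle \Psi_{m\alpha,0} | [K_\mu, [K_\nu, (\boldsymbol{\alpha}_\mu \cdot \nabla_\mu)(\boldsymbol{\alpha}_\nu \cdot \nabla_\nu)\, r_{\mu\nu}]] | \Psi_{m\beta,0} \rangle \\
&= \tfrac14 \sum_\mu \sum_{\nu \neq \mu} e_\mu e_\nu\, \langle \Psi_{m\alpha,0} | [K_\mu, [K_\nu, r_{\mu\nu}^{-1}\, \boldsymbol{\alpha}_\mu \cdot \boldsymbol{\alpha}_\nu - r_{\mu\nu}^{-3} (\boldsymbol{\alpha}_\mu \cdot \mathbf{r}_{\mu\nu})(\boldsymbol{\alpha}_\nu \cdot \mathbf{r}_{\mu\nu})]] | \Psi_{m\beta,0} \rangle
\end{aligned}
\tag{153}
$$

The third interaction term is evaluated in a similar manner: we need now to absorb $(\varepsilon_{m,0} - \varepsilon_{n,0})^{-2}$ into the matrix elements. The second equation (130) provides the required relation, namely

$$
(\varepsilon_{m,0} - \varepsilon_{n,0})^{-1}\, \mathbf{j}_{\mu, m\alpha, n\beta, 0}(\mathbf{r}') = (2m_\mu)^{-1} \langle \Psi_{m\alpha,0} | \beta_\mu \mathbf{j}_\mu(\mathbf{r}') | \Psi_{n\beta,0} \rangle \tag{154}
$$

To prepare for the completion of the summation over all states we need one more relation. Starting from

$$
\beta_\mu \mathbf{j}_\mu(\mathbf{r}') = (2m_\mu)^{-1} [M, \mathbf{j}_\mu(\mathbf{r}')] \tag{155}
$$

we obtain, taking matrix elements

$$
\langle \Psi_{m\alpha,0} | \beta_\mu \mathbf{j}_\mu(\mathbf{r}') | \Psi_{n\beta,0} \rangle = (2m_\mu)^{-1} (\varepsilon_{m,0} - \varepsilon_{n,0})\, \mathbf{j}_{\mu, m\alpha, n\beta, 0}(\mathbf{r}') \tag{156}
$$

from which we conclude

$$
\langle \Psi_{m\alpha,0} | \beta_\mu \mathbf{j}_\mu(\mathbf{r}') | \Psi_{n\beta,0} \rangle = 0, \qquad n \in \mathcal{F}(m) \tag{157}
$$

Using equations (112), (121), (131), (154) and (157) we obtain

$$
\begin{aligned}
-4\pi\hbar^2 \sum_{n \in \mathcal{G}(m)} (\varepsilon_{m,0} - \varepsilon_{n,0})^{-2} F_{nm\alpha\beta,0}(\mathbf{r}', \mathbf{r}'')
&= 2\pi\hbar^2 \sum_{n \in \mathcal{G}(m)} \sum_\mu \sum_{\nu \neq \mu} \sum_\gamma (\varepsilon_{m,0} - \varepsilon_{n,0})^{-2}\, \mathbf{j}_{\mu, m\alpha, n\gamma, 0}(\mathbf{r}') \cdot \mathbf{j}_{\nu, n\gamma, m\beta, 0}(\mathbf{r}'') \\
&= -\tfrac12 \pi\hbar^2 \sum_{n \in \mathcal{G}(m)} \sum_\mu \sum_{\nu \neq \mu} \sum_\gamma (m_\mu m_\nu)^{-1} \langle \Psi_{m\alpha,0} | \beta_\mu \mathbf{j}_\mu(\mathbf{r}') | \Psi_{n\gamma,0} \rangle \cdot \langle \Psi_{n\gamma,0} | \beta_\nu \mathbf{j}_\nu(\mathbf{r}'') | \Psi_{m\beta,0} \rangle \\
&= -\tfrac12 \pi\hbar^2 \sum_\mu \sum_{\nu \neq \mu} \sum_{n\gamma} (m_\mu m_\nu)^{-1} \langle \Psi_{m\alpha,0} | \beta_\mu \mathbf{j}_\mu(\mathbf{r}') | \Psi_{n\gamma,0} \rangle \cdot \langle \Psi_{n\gamma,0} | \beta_\nu \mathbf{j}_\nu(\mathbf{r}'') | \Psi_{m\beta,0} \rangle \\
&= -\tfrac12 \pi\hbar^2 \sum_\mu \sum_{\nu \neq \mu} (m_\mu m_\nu)^{-1} \langle \Psi_{m\alpha,0} | \beta_\mu \beta_\nu\, \mathbf{j}_\mu(\mathbf{r}') \cdot \mathbf{j}_\nu(\mathbf{r}'') | \Psi_{m\beta,0} \rangle
\end{aligned}
\tag{158}
$$

Integration over $\mathbf{r}'$ and $\mathbf{r}''$ with the factor $\delta(\mathbf{r}' - \mathbf{r}'')$ yields for the third interaction term of $\varepsilon_{m,2}$

$$
-4\pi\hbar^2 \sum_{n \in \mathcal{G}(m)} (\varepsilon_{m,0} - \varepsilon_{n,0})^{-2} \int d\mathbf{r}' \int d\mathbf{r}''\, \delta(\mathbf{r}' - \mathbf{r}'')\, F_{nm\alpha\beta,0}(\mathbf{r}', \mathbf{r}'')
= -\tfrac12 \pi\hbar^2 \sum_\mu \sum_{\nu \neq \mu} (e_\mu e_\nu / m_\mu m_\nu)\, \langle \Psi_{m\alpha,0} | \delta(\mathbf{r}_\mu - \mathbf{r}_\nu)\, \beta_\mu \beta_\nu\, \boldsymbol{\alpha}_\mu \cdot \boldsymbol{\alpha}_\nu | \Psi_{m\beta,0} \rangle
\tag{159}
$$

We now collect the interaction contributions to $\varepsilon_{m,2}$, equations (150), (153) and (159), and substitute them into the third equation (128). The result is

$$
\begin{aligned}
\varepsilon_{m,2}\, \delta_{\alpha\beta} ={}& \sum_{q=0}^{4} \langle \Psi_{m\alpha,4-q} | M | \Psi_{m\beta,q} \rangle + \sum_{q=0}^{3} \langle \Psi_{m\alpha,3-q} | P | \Psi_{m\beta,q} \rangle \\
&+ \sum_{q=0}^{2} \langle \Psi_{m\alpha,2-q} | V | \Psi_{m\beta,q} \rangle + \langle \Psi_{m\alpha,0} | V' + B + B' | \Psi_{m\beta,0} \rangle
\end{aligned}
\tag{160}
$$

where

$$
V' = -\tfrac12 \sum_\mu \sum_{\nu \neq \mu} e_\mu e_\nu\, [K_\mu, [K_\nu, r_{\mu\nu}^{-1}]] \tag{161}
$$

$$
B = -\tfrac14 \sum_\mu \sum_{\nu \neq \mu} e_\mu e_\nu\, [K_\mu, [K_\nu, r_{\mu\nu}^{-1}\, \boldsymbol{\alpha}_\mu \cdot \boldsymbol{\alpha}_\nu + r_{\mu\nu}^{-3} (\boldsymbol{\alpha}_\mu \cdot \mathbf{r}_{\mu\nu})(\boldsymbol{\alpha}_\nu \cdot \mathbf{r}_{\mu\nu})]] \tag{162}
$$

$$
B' = -\tfrac12 \pi\hbar^2 \sum_\mu \sum_{\nu \neq \mu} (e_\mu e_\nu / m_\mu m_\nu)\, \delta(\mathbf{r}_\mu - \mathbf{r}_\nu)\, \beta_\mu \beta_\nu\, \boldsymbol{\alpha}_\mu \cdot \boldsymbol{\alpha}_\nu \tag{163}
$$

We note that the Coulomb operator $V$ has again joined the Dirac Hamiltonian in a natural way. The remaining terms with the operators $V'$, $B$ and $B'$ are evaluated with the zero-order wave functions, hence they do not participate in the further determination of the wave functions. If one retraces the origin of the term $V'$, it is seen to represent a higher order residual Coulomb interaction. On the other hand $B$ and $B'$ are due to magnetic interaction and retardation; they represent radiative particle-particle interactions.

Before applying the variation principle to equation (160) we carry out a few more simplifications. Among other things, we eliminate the third and fourth order wave functions. We take the orthonormality condition (111) for $p = 2$ and $p = 4$, multiply by $\varepsilon_{m,1}$ and $\varepsilon_{m,0}$, respectively, and subtract these equations from equation (160). We observe further that

$$
\begin{aligned}
(M - \varepsilon_{m,0})\, \Psi_{m\alpha,1} + P\, \Psi_{m\alpha,0} &= 0 \\
P\, \Psi_{m\alpha,1} + (V - \varepsilon_{m,1})\, \Psi_{m\alpha,0} &= -\tfrac12 (M - \varepsilon_{m,0})\, K^2 \Psi_{m\alpha,0}
\end{aligned}
\tag{164}
$$

which are easily proved; the net result of all this is

$$
\begin{aligned}
\varepsilon_{m,2}\, \delta_{\alpha\beta} ={}& \langle \Psi_{m\alpha,2} - \tfrac12 K^2 \Psi_{m\alpha,0} | M - \varepsilon_{m,0} | \Psi_{m\beta,2} - \tfrac12 K^2 \Psi_{m\beta,0} \rangle \\
&+ \langle \Psi_{m\alpha,0} | K (\varepsilon_{m,1} - V) K + \tfrac14 K^2 (\varepsilon_{m,0} - M) K^2 + V' + B + B' | \Psi_{m\beta,0} \rangle
\end{aligned}
\tag{165}
$$

We now apply the variation principle to equation (165). By the usual techniques we find

$$
(M - \varepsilon_{m,0})(\Psi_{m\alpha,2} - \tfrac12 K^2 \Psi_{m\alpha,0}) = 0 \tag{166}
$$

The general solution of equation (166) is

$$
\Psi_{m\alpha,2} = \tfrac12 K^2 \Psi_{m\alpha,0} + \Psi''_{m\alpha,0} \tag{167}
$$

where $\Psi''_{m\alpha,0}$ is any function of rest mass $\varepsilon_{m,0}$. The second-order energy is clearly not affected by $\Psi''_{m\alpha,0}$; the latter can only be determined fully by applying the variation principle to higher order.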
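The change of summation variables in equations (148) and (149) can be confirmed by brute-force enumeration. The sketch below simply lists the index triples $(p, q, r)$ allowed by the original limits, lists the triples produced from $(t, s, r)$ through the substitution $p = 2 + r - t$, $q = s - r$, and compares the two sets. The function names are illustrative.

```python
def original_triples():
    """All (p, q, r) with 0 <= p <= 2, 0 <= q <= 2 - p, 0 <= r <= p."""
    return [(p, q, r) for p in range(3) for q in range(3 - p) for r in range(p + 1)]


def transformed_triples():
    """(p, q, r) generated from 0 <= r <= s <= t <= 2 via p = 2 + r - t, q = s - r."""
    triples = []
    for t in range(3):
        for s in range(t + 1):
            for r in range(s + 1):
                triples.append((2 + r - t, s - r, r))
    return triples


before = sorted(original_triples())
after = sorted(transformed_triples())
print(len(before), len(after), before == after)
```

Both enumerations should contain the same triples, each exactly once, so every term of the original triple sum is counted once and only once after the substitution.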
It is of course necessary that $\Psi_{m\alpha,2}$ as given by equation (167) does not violate the orthonormality, completeness and charge conservation conditions. It is easy to show that orthonormality and completeness are satisfied provided that

$$
\langle \Psi''_{m\alpha,0} | \Psi_{m\beta,0} \rangle + \langle \Psi_{m\alpha,0} | \Psi''_{m\beta,0} \rangle = 0 \tag{168}
$$

That the applicable charge conservation conditions are satisfied is again demonstrated in Appendix C.

On account of equation (166) the second-order energy is given by the last term of equation (165) only. We simplify this expression further by eliminating $\varepsilon_{m,0}$ and $\varepsilon_{m,1}$ using equations (107) and (139), applying half of the operators $M$ and $T + V$ to the right and half to the left. After some more manipulation, the result can be written in the simple form

$$
\varepsilon_{m,2}\, \delta_{\alpha\beta} = \langle \Psi_{m\alpha,0} | T' + V_{LS} + B + B' | \Psi_{m\beta,0} \rangle \tag{169}
$$

where

$$
T' = \tfrac14 [K, [K, T]] \tag{170}
$$

$$
V_{LS} = \tfrac12 [K, [K, V]] + V' \tag{171}
$$

The operators $T'$ and $V_{LS}$ defined by equations (170) and (171) are the proper many-particle generalizations of the corresponding operators for the single-particle case; in fact if we specialize equations (170) and (171) for a single particle, equations (87) and (88) result. It is not unimportant that the operator $T'$ is now in general defined as a double commutator where all the participating operators are sums of one-particle operators; this guarantees that $T'$ is again a sum of one-particle operators, which of course it must be if it is to represent the relativistic mass correction. Straightforward evaluation of the double commutator yields

$$
T' = -\sum_\mu (2m_\mu)^{-3}\, \beta_\mu\, (\mathbf{p}_\mu \cdot \mathbf{p}_\mu)^2 \tag{172}
$$

For the evaluation of matrix elements with $T'$ the same caution applies as for the single-particle case: one factor $\mathbf{p}_\mu \cdot \mathbf{p}_\mu$ should operate to the right, the other one to the left.

We now proceed with the evaluation of the spin-orbit interaction operator $V_{LS}$, equation (171). If we expand the operators K in the double c